Abstract



We develop a hypothesis testing framework for the formulation of the problems of 1) the
validation of a simulation model and 2) using modeling to certify the performance of a physical
system. These results are used to solve the extrapolative validation and certification problems,
namely problems where the regime of interest is different from the regime for which we have
experimental data. We use concentration of measure theory to develop the tests and analyze
their errors. This work was stimulated by the work of Lucas, Owhadi, and Ortiz [1] where
a rigorous method of validation and certification is described and tested. In Remark 2.5 we
describe the connection between the two approaches. Moreover, as mentioned in that work these
results have important implications in the Quantification of Margins and Uncertainties (QMU)
framework. In particular, in Remark 2.6 we describe how it provides a rigorous interpretation
of the notion of confidence and new notions of margins and uncertainties which allow this
interpretation. Since certain concentration parameters used in the above tests may be unknown,
we furthermore show, in the last half of the paper, how to derive equally powerful tests which
estimate them from sample data, thus replacing the assumption of the values of the concentration
parameters with weaker assumptions.

1 Introduction 

Validation of simulation models is clearly important and much substantial work has been directed
towards it, see e.g. [2, 3, 4, 5, 6, 7, 8] and the references therein. Moreover, the problem appears
to go straight to the heart of the philosophy of science (see e.g. [9, 10, 11]). Indeed, [12] assert
that validation is impossible, and [1] describe a rigorous method for it. On the other hand, it
appears that while all agree that validation is an important and difficult problem, few agree on
what the problem actually is. In the words of G. K. Chesterton [13, pg. ix], "It isn't that they
can't see the solution. It is that they can't see the problem." In this paper we formulate examples
of both the problems of validation and certification as problems of constructing hypothesis tests. A
straightforward analysis using concentration of measure theory then provides tests and guarantees
on their performance.

Although hypothesis tests have been used in validation before, e.g. in [14, 15], our formulation
is quite different. In particular, we formulate null and alternative hypotheses which represent a
flexibility in the customer's specification of a performance design threshold. We develop tests
that require a clear delineation of assumptions and then use concentration of measure inequalities
to analyze the performance of the tests. These results are then used to solve the extrapolative
validation and certification problems, namely problems where the deployment regime is different
from the experimental regime. This framework is then compared with that of Lucas, Owhadi
and Ortiz [1]. As mentioned in that work, these results also have important implications in the
Quantification of Margins and Uncertainties (QMU) framework discussed in detail in [16, 17, 18].
In particular, in Remark 2.6 we discuss how these results provide a rigorous interpretation of
the notion of confidence and a new notion of uncertainty which allows this interpretation. Since
certain concentration parameters used in the above tests may be unknown, we furthermore show 
how to derive equally powerful tests which estimate them from sample data, thus replacing the 
assumption of the values of the concentration parameters with much weaker assumptions. This 
humble beginning needs to be refined so that it fits better with real applications. It should also 
incorporate some of the conclusions and structure of the above-mentioned works, but we leave that 
for the future. Let us now describe our framework and formulations. At a high level we say that 
validation is the assessment of the quality of a model of a physical system and certification is using 
modeling to assess the performance of a physical system. To make these notions more specific, 
consider the following general framework which will be used for both validation and certification. 

Consider the case of a real-valued random variable $U$ that describes the performance of a system
and a customer who would like to have a quantitative guarantee on this performance. You inform
the customer that you can consider a test of the hypothesis

$$\mathbb{P}(U \ge a) \ge p$$

where a is the performance design threshold and p is a level of confidence. When pressed to provide 
the specific values of the parameters a and p the customer may provide values, for example a = 1000 
and p = .95. However, if you then ask him whether a = 950 and p = .93 would be acceptable, he 
might respond in the affirmative. Consequently, a more realistic test might be to test 

$$\mathbb{P}(U \ge A) \ge P \qquad (1)$$

where $A$ and $P$ are sets instead of real numbers. However, what (1) actually means and how to
construct and analyze a test for it are not clear. To resolve this problem, let us introduce some
notation. Let $\mathcal{U}$ denote the set of real-valued random variables. For $a \in \mathbb{R}$, $p \in (0, 1)$ define the
null hypothesis by

$$\mathcal{H}_{a,p} := \{U \in \mathcal{U} : \mathbb{P}(U \ge a) \ge p\}$$

and the alternative by

$$\mathcal{K}_{a,p} := \{U \in \mathcal{U} : \mathbb{P}(U \ge a) < p\}.$$

Consider $a' \le a$, $p' \le p$ and suppose that $U \in \mathcal{H}_{a,p} \cap \mathcal{K}_{a',p'}$. Then, since

$$p' > \mathbb{P}(U \ge a') \ge \mathbb{P}(U \ge a) \ge p$$

is a contradiction, we conclude that

$$\mathcal{H}_{a,p} \cap \mathcal{K}_{a',p'} = \emptyset, \qquad a' \le a,\ p' \le p. \qquad (2)$$

Therefore, when $a' \le a$ and $p' \le p$ we can consider a test of $\mathcal{H}_{a,p}$ against $\mathcal{K}_{a',p'}$. Now let $a$ and $p$ be
specified and specify tolerance intervals $A$ and $P$ such that $A \le a$ and $P \le p$, where the notation
implies that $a \in A$ and $p \in P$. Then by (2) we can define a test of (1) by testing $\mathcal{H}_{a,p}$ against $\mathcal{K}_{a',p'}$
for some $a' \in A$ and $p' \in P$. Given the freedom the tolerance intervals allow in the choice of $a'$ and
$p'$, we seek to choose them to our advantage.
Let us first consider the case where $A = \{a\}$, namely there is no tolerance to changing the
design criterion. We wish to construct a test of $\mathcal{H}_{a,p}$ against $\mathcal{K}_{a,p'}$ for $p' \in P$. Let $U_i, i = 1,..,n$
be i.i.d. samples from $U$. We can form a test by composing the sample data $U_i, i = 1,..,n$ with
the indicator function $I_a : \mathbb{R} \to \{0, 1\}$ defined by $I_a(u) = 1, u \ge a$ and $I_a(u) = 0, u < a$ to obtain
Bernoulli random variables $I_a \circ U_i$. That is, we simply evaluate whether the sample points are
greater than or equal to $a$ or not. We form a test of $\mathcal{H}_{a,p}$ against $\mathcal{K}_{a,p'}$ by forming the binomial
test of $\mathcal{H}_p$ against $\mathcal{K}_{p'}$ where

$$\mathcal{H}_p := \{X : \mathbb{P}(X = 1) = r,\ \mathbb{P}(X = 0) = 1 - r,\ r \ge p\}$$

and

$$\mathcal{K}_{p'} := \{X : \mathbb{P}(X = 1) = r,\ \mathbb{P}(X = 0) = 1 - r,\ r < p'\}.$$

By the Neyman-Pearson Lemma [19, Thm. 3.1] and [19, Thm. 3.2] we know there exists a uniformly
most powerful test of $\mathcal{H}_p$ against $\mathcal{K}_{p'}$ (see e.g. [19, Ch. 3]). However, this uniformly most powerful
test is characterized through the binomial distribution. Approximate tests with rigorous guarantees
on their type I and II errors appear, in principle, to be available, but stating them is evidently
no easy matter. Rigorous bounds connecting the binomial distribution to the normal can
be found in Feller [20] and to the Poisson distribution in Anderson and Samuels [21]. Guarantees
outside of the range of applicability of these results can be found in Slud [22]. Approximations to the
optimal test parameters have been derived and studied empirically in Shore [23, 24], and Chernoff
[25] has analyzed the asymptotics, in particular when $p'$ is close to $p$. Although a comprehensive
rigorous analysis of this case should be completed, that is not our goal here.

Instead we consider the case where $P = \{p\}$, where there is no tolerance to the value $p$, but a
nontrivial tolerance in the design criteria $A$. That is, we test $\mathcal{H}_{a,p}$ against $\mathcal{K}_{a',p}$ for some
$a' \in A$. For simplicity we remove the $p$ from the notation of the hypothesis spaces, that is, from
now on $\mathcal{H}_{a,p}$ and $\mathcal{K}_{a',p}$ are denoted by $\mathcal{H}_a$ and $\mathcal{K}_{a'}$ respectively. We will show that reducing the
spaces of random variables further allows the development and analysis of efficient tests and that
this analysis is quite elementary. The full problem of testing $\mathcal{H}_{a,p}$ against $\mathcal{K}_{a',p'}$ for $a' \in A$ and
$p' \in P$ where both tolerance intervals are nontrivial might be accomplished through a combination
of the above mentioned analysis and the results herein.

To reduce the null and alternative hypothesis sets we will consider random variables
$U$ which are generated as $U = F(X)$ by real functions $F : \mathcal{X} \to \mathbb{R}$ where each $X$ is a vector random
variable. We make assumptions on this set of functions and vector random variables that guarantee
the degree of concentration of $U$ about its mean in terms of a concentration parameter $\mathcal{D}$ (all this
will be clarified below). We denote by $\mathcal{U}_{\mathcal{D}}$ the resulting space of real-valued random variables and
reduce the null and alternative hypothesis spaces accordingly. Having performed this reduction, we
will demonstrate how to construct tests of $\mathbb{P}(U \ge A) \ge p$ in terms of $\mathcal{D}$ and describe their type I
and II errors. In addition, we observe that if $A$ is large enough compared to $\mathcal{D}$ we can obtain tests
with small type I and II errors. We disregard measurability considerations.

In many applications, we want to validate a model or certify a physical system in the deployment
regime where the real physical system is impossible or expensive to sample. In Section 3 we obtain
the first results, as far as we can tell, for this extrapolation problem.
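
As a small illustration of the binomial test of $\mathcal{H}_p$ against $\mathcal{K}_{p'}$ discussed earlier in this section, the following Python sketch evaluates the worst-case errors of a plain (non-randomized) threshold test that accepts the null when at least `k` of the `n` samples reach the threshold `a`. The function names and the choice of a simple count threshold are illustrative assumptions; this is not the uniformly most powerful randomized test of [19], only an exact computation for one simple test.

```python
import math


def binom_tail(n: int, r: float, k: int) -> float:
    """P(S >= k) for S ~ Binomial(n, r), computed exactly from the pmf."""
    return sum(math.comb(n, i) * r**i * (1.0 - r) ** (n - i) for i in range(k, n + 1))


def count_exceedances(samples, a: float) -> int:
    """Number of samples u with u >= a, i.e. the sum of the indicators I_a(u)."""
    return sum(1 for u in samples if u >= a)


def binomial_test_errors(n: int, k: int, p: float, p_prime: float):
    """Worst-case errors of the test 'accept the null iff count >= k'.

    P(S >= k) is nondecreasing in r, so the type I error over r >= p is
    largest at r = p, and the type II error over r < p' is bounded by its
    value at r = p'.
    """
    type_one = 1.0 - binom_tail(n, p, k)
    type_two = binom_tail(n, p_prime, k)
    return type_one, type_two


if __name__ == "__main__":
    # Hypothetical numbers: 50 samples, accept when at least 44 exceed a.
    print(binomial_test_errors(n=50, k=44, p=0.95, p_prime=0.75))
```

Scanning `k` for fixed `n`, `p`, and `p_prime` shows the usual trade-off: raising `k` lowers the type II bound and raises the type I bound.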

To apply these results to validation, we let $U = F(X)$ denote a measure of a model's fit to
a physical system with respect to a quantity of interest. For example if, for the value $x$ of the
random variable $X$, the model predicts the strength of a material to be $s_m(x)$ and the physical
system obtains the strength $s_{ph}(x)$ then we might define $F(x) := \frac{1}{|s_m(x) - s_{ph}(x)|}$. Then surpassing
the performance threshold $a$ is equivalent to $|s_m(x) - s_{ph}(x)| \le \frac{1}{a}$. We apply the above mentioned
result to obtain a solution to the validation problem of constructing a test of $\mathcal{H}_a$ against $\mathcal{K}_{a'}$ using
samples from $F(X)$ which has small type I and II errors.

To apply this result to certification, we
let $F(X)$ be the performance of the physical system and $M(X)$ be the performance of the physical
system predicted by the model. For example, let $F(x) := s_{ph}(x)$ be the strength of the physical
system and $M(x) := s_m(x)$ be the strength of the physical system simulated by the model. We
apply the above mentioned result to obtain a solution to the certification problem of constructing
a test of $\mathcal{H}_a$ against $\mathcal{K}_{a'}$ using samples from $F(X)$ and $M(X)$ which has small type I and II errors.
Using the above mentioned tests, we observe in a quantitative way the intuitive result that if the
validation diameter $\mathcal{D}_{F-M}$ is much smaller than the model diameter $\mathcal{D}_M$, then we need far fewer
samples of the real physical system $F$ than of the model $M$ to certify the performance of $F$. In Remark
2.5 we describe the connection to the rigorous validation and certification results of Lucas, Owhadi,
and Ortiz [1].

These results generalize easily to other concentration inequalities. In particular, using
concentration theorems for non-i.i.d. sampling we can, with a substantial increase in complexity,
obtain good tests when the empirical data are not generated i.i.d. or when the components of
the random vector $X$ are not independent. These tests and bounds on their performance require
knowing the values of the diameter $\mathcal{D}_F$ for validation and $\mathcal{D}_{F-M}$ and $\mathcal{D}_M$ for certification. Since
good approximations to these values may not be known in practice, we show, beginning in Section
4, how to estimate them to derive equally powerful tests, replacing the assumption of the values of
the concentration parameters with much weaker assumptions. These tests provide validation and
certification tests with estimated diameters.



2 Validation and Certification with Known Diameters 

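Let us first describe the concentration parameter $\mathcal{D}$ mentioned above. Let $(\Omega, \mathcal{F}, \mathbb{P})$ denote a
probability space and consider a product space $\mathcal{X} = \mathcal{X}_1 \times \cdots \times \mathcal{X}_m$. We call a mapping $X : \Omega \to \mathcal{X}$
a random vector with range $\mathcal{X}$ and will abuse notation by also using the symbol $\mathbb{P}$ for the image
probability measure on $\mathcal{X}$. For a function $F : \mathcal{X} \to \mathbb{R}$ we define the partial diameters to be

$$D_F^j := \sup_{x_k = x'_k,\ k \ne j} \big(F(x) - F(x')\big), \qquad j = 1,..,m \qquad (3)$$

where the supremum is taken over all $x, x' \in \mathcal{X}$ which differ only in their $j$-th component. Let
$\mathcal{D}_F := \sqrt{\sum_{j=1}^m (D_F^j)^2}$ define the McDiarmid diameter $\mathcal{D}_F$ of the function $F$. For a vector random
variable $X$ and function $F : \mathcal{X} \to \mathbb{R}$ we consider the random variable $F \circ X : \Omega \to \mathbb{R}$ which we also
denote by $F$. For the random variable $F$ we have McDiarmid's inequality [26, Thm. 3.1, pg. 206]:

Theorem 2.1 Let $\mathcal{X} = \mathcal{X}_1 \times \cdots \times \mathcal{X}_m$ be a Cartesian product and let $F : \mathcal{X} \to \mathbb{R}$ have the
McDiarmid diameter $\mathcal{D}_F$. Then for any product probability measure $\mathbb{P} = \mu_1 \otimes \cdots \otimes \mu_m$ we have

$$\mathbb{P}(F - \mathbb{E}F \ge r) \le e^{-\frac{2r^2}{\mathcal{D}_F^2}}.$$

If, for $0 < t < 1$, we define

$$r_t := \frac{\mathcal{D}_F}{\sqrt{2}}\sqrt{\log t^{-1}}$$

then we have the following useful inequalities:

$$\mathbb{P}(F - \mathbb{E}F \ge r_t) \le t$$

$$\mathbb{P}(F - \mathbb{E}F > r_t) \le t$$

$$\mathbb{P}(F - \mathbb{E}F \le -r_t) \le t$$

$$\mathbb{P}(F - \mathbb{E}F < -r_t) \le t.$$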
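
The inversion between a deviation $r$ and a probability level $t$ in Theorem 2.1 is used repeatedly below, so a short numerical sketch may help. It assumes the standard McDiarmid exponent $-2r^2/\mathcal{D}_F^2$ and the corresponding radius $r_t = \mathcal{D}_F\sqrt{\log(t^{-1})/2}$; the function names are illustrative.

```python
import math


def mcdiarmid_tail_bound(r: float, diameter: float) -> float:
    """Upper bound exp(-2 r^2 / D^2) on P(F - EF >= r) for McDiarmid diameter D."""
    return math.exp(-2.0 * r * r / (diameter * diameter))


def mcdiarmid_radius(t: float, diameter: float) -> float:
    """Radius r_t at which the tail bound equals t, for 0 < t < 1."""
    if not 0.0 < t < 1.0:
        raise ValueError("t must lie strictly between 0 and 1")
    return diameter * math.sqrt(math.log(1.0 / t) / 2.0)


if __name__ == "__main__":
    diameter = 3.0  # hypothetical value of the McDiarmid diameter
    for t in (0.5, 0.05, 0.001):
        r = mcdiarmid_radius(t, diameter)
        # The last column reproduces t up to rounding.
        print(t, r, mcdiarmid_tail_bound(r, diameter))
```

Note that the radius grows only like the square root of $\log t^{-1}$, which is why small error levels cost relatively little margin in the tests that follow.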
Since this theorem's only dependence on $F$ and $X$ is through the parameter $\mathcal{D}_F$ we can define the
subset $\mathcal{U}_{\mathcal{D}} \subset \mathcal{U}$ consisting of all real-valued random variables generated as $U = F(X)$ for some $F$
and $X$ such that $\mathcal{D}_F \le \mathcal{D}$. Let $\mathcal{H}_a^{\mathcal{D}} := \mathcal{H}_a \cap \mathcal{U}_{\mathcal{D}}$ and $\mathcal{K}_{a'}^{\mathcal{D}} := \mathcal{K}_{a'} \cap \mathcal{U}_{\mathcal{D}}$ denote the null and alternative
hypothesis spaces generated in this way and consider testing $\mathcal{H}_a^{\mathcal{D}}$ against $\mathcal{K}_{a'}^{\mathcal{D}}$. We are now ready to state our main
result which we then use to establish both validation and certification results. We describe a test
of $\mathcal{H}_a^{\mathcal{D}}$ against $\mathcal{K}_{a'}^{\mathcal{D}}$ for $a - a'$ bounded below in terms of $\mathcal{D}$ and $p$. Therefore if $[a', a] \subset A$, then
the following result provides a test of $\mathbb{P}(U \ge A) \ge p$, with bounds on its errors. Note that the
test is in terms of the value of a function $F' : \mathcal{Y} \to \mathbb{R}$ for some random vector $Y$ with the only
constraint being $\mathbb{E}F' = \mathbb{E}U$. All tests in this work accept the null $\mathcal{H}_a^{\mathcal{D}}$ by producing $T = 1$ and
reject otherwise. Recall that the type I error is defined by $\theta_1(U) := \mathbb{P}(T = 0)$, $U \in \mathcal{H}_a^{\mathcal{D}}$ and the
type II error is defined by $\theta_2(U) := \mathbb{P}(T = 1)$, $U \in \mathcal{K}_{a'}^{\mathcal{D}}$.

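Theorem 2.2 Let $0 < p < 1$, $\mathcal{D}, \mathcal{D}' > 0$ and consider $U \in \mathcal{U}_{\mathcal{D}}$. Moreover, consider a vector
random variable $Y$ and a function $F' : \mathcal{Y} \to \mathbb{R}$ with diameter $\mathcal{D}_{F'} \le \mathcal{D}'$ such that $\mathbb{E}F' = \mathbb{E}U$. For
$0 < t < 1$ define $r_t := \frac{\mathcal{D}}{\sqrt{2}}\sqrt{\log t^{-1}}$ and $r'_t := \frac{\mathcal{D}'}{\sqrt{2}}\sqrt{\log t^{-1}}$. Let $0 < \delta_1, \delta_2 < 1$, and let $a$ and $a'$
satisfy $a - a' \ge r_p + r_{1-p} + r'_{\delta_1} + r'_{\delta_2}$ so that the interval $[a' + r_{1-p} + r'_{\delta_2},\ a - r_p - r'_{\delta_1}]$ is nonempty.
Let $b \in [a' + r_{1-p} + r'_{\delta_2},\ a - r_p - r'_{\delta_1}]$. Then the test $T$ of $\mathcal{H}_a^{\mathcal{D}}$ against $\mathcal{K}_{a'}^{\mathcal{D}}$ defined by

$$T := \begin{cases} 1, & F'(y) \ge b \\ 0, & F'(y) < b \end{cases}$$

satisfies

$$\theta_1(U) \le \delta_1, \qquad \theta_2(U) \le \delta_2.$$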



The condition $\mathbb{E}F' = \mathbb{E}U$ of Theorem 2.2 can be easily satisfied when i.i.d. samples are available.
Therefore, in this case, it is straightforward to use Theorem 2.2 to define tests, with guarantees on
their errors, for both validation and certification.

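Corollary 2.3 (Validation) Let $U = F(X)$ and suppose $\mathcal{D} \ge \mathcal{D}_F$. Let $F(X_i), i = 1,..,n$ be i.i.d.
samples of $F(X)$ and define $\langle F \rangle_n := \frac{1}{n}\sum_{i=1}^n F(X_i)$ to be the sample mean. Let $0 < p, \delta_1, \delta_2 < 1$
and for $0 < t < 1$ define $r_t := \frac{\mathcal{D}}{\sqrt{2}}\sqrt{\log t^{-1}}$. Moreover, let $a$ and $a'$ satisfy
$a - a' \ge r_p + r_{1-p} + \frac{1}{\sqrt{n}} r_{\delta_1} + \frac{1}{\sqrt{n}} r_{\delta_2}$ so that the interval
$[a' + r_{1-p} + \frac{1}{\sqrt{n}} r_{\delta_2},\ a - r_p - \frac{1}{\sqrt{n}} r_{\delta_1}]$ is nonempty. Let
$b \in [a' + r_{1-p} + \frac{1}{\sqrt{n}} r_{\delta_2},\ a - r_p - \frac{1}{\sqrt{n}} r_{\delta_1}]$ and consider the test $T$ of $\mathcal{H}_a^{\mathcal{D}}$ against $\mathcal{K}_{a'}^{\mathcal{D}}$ defined by

$$T := \begin{cases} 1, & \langle F \rangle_n \ge b \\ 0, & \langle F \rangle_n < b. \end{cases}$$

Then we have

$$\theta_1(U) \le \delta_1, \qquad \theta_2(U) \le \delta_2.$$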
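
The validation test of Corollary 2.3 is simple enough to sketch in a few lines of Python. The sketch assumes the radius $r_t = \frac{\mathcal{D}}{\sqrt{2}}\sqrt{\log t^{-1}}$ and the admissible interval for $b$ as read from the corollary; all function names and the numerical values are hypothetical.

```python
import math
from statistics import fmean


def radius(t: float, diameter: float) -> float:
    """r_t = D * sqrt(log(1/t) / 2) for a diameter bound D."""
    return diameter * math.sqrt(math.log(1.0 / t) / 2.0)


def validation_interval(a, a_prime, p, delta1, delta2, diameter, n):
    """Admissible interval (lo, hi) for the threshold b, or None if it is empty."""
    root_n = math.sqrt(n)
    lo = a_prime + radius(1.0 - p, diameter) + radius(delta2, diameter) / root_n
    hi = a - radius(p, diameter) - radius(delta1, diameter) / root_n
    return (lo, hi) if lo <= hi else None


def validation_test(samples, b: float) -> int:
    """T = 1 (accept the null) iff the sample mean is at least b."""
    return 1 if fmean(samples) >= b else 0


if __name__ == "__main__":
    # Hypothetical setting: diameter bound 1, n = 100 samples.
    interval = validation_interval(a=3.0, a_prime=0.0, p=0.95,
                                   delta1=0.05, delta2=0.05, diameter=1.0, n=100)
    print(interval)
```

When the gap $a - a'$ is too small relative to the diameter bound, `validation_interval` returns `None`, which corresponds to the case where the corollary gives no test at the requested error levels.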



As discussed in the introduction, if $F(X)$ represents a physical system and $M(X)$ a model of
that system we can consider how to test the performance of $F$ by decomposing $F(X) = M(X) +
(F(X) - M(X))$ into the model component and the model deviation component. If the test accepts
we obtain certification. We now show how to sample the model and the model deviation to test the
performance of the physical system. In particular, the following result shows that if the validation
diameter $\mathcal{D}_{F-M}$ is much smaller than the model diameter $\mathcal{D}_M$, then we need far fewer samples
of the real physical system $F$ than of the model $M$ to certify the performance of $F$. It is phrased in
terms of a general decomposition $F = F_1 + F_2$.

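Corollary 2.4 (Certification) Let $U = F(X)$ where $F := F_1 + F_2$ is the sum of two functions
with diameters $\mathcal{D}_{F_1}$ and $\mathcal{D}_{F_2}$. Let $\mathcal{D}, \mathcal{D}_1$, and $\mathcal{D}_2$ satisfy $\mathcal{D}_1 \ge \mathcal{D}_{F_1}$, $\mathcal{D}_2 \ge \mathcal{D}_{F_2}$ and $\mathcal{D} \ge \mathcal{D}_1 + \mathcal{D}_2$.
Let $F_1(X_i), i = 1,..,n_1$ be i.i.d. samples of $F_1(X)$ and define $\langle F_1 \rangle_{n_1} := \frac{1}{n_1}\sum_{i=1}^{n_1} F_1(X_i)$ to be
the sample mean. Also let $F_2(X_i), i = n_1 + 1,..,n_1 + n_2$ be i.i.d. samples of $F_2(X)$ and define
$\langle F_2 \rangle_{n_2} := \frac{1}{n_2}\sum_{i=n_1+1}^{n_1+n_2} F_2(X_i)$ to be the sample mean of the second set of samples. For $0 < t < 1$
define

$$r_t := \frac{\mathcal{D}}{\sqrt{2}}\sqrt{\log t^{-1}}$$

and

$$\rho_t := \frac{1}{\sqrt{2}}\sqrt{\frac{\mathcal{D}_1^2}{n_1} + \frac{\mathcal{D}_2^2}{n_2}}\,\sqrt{\log t^{-1}}.$$

Let $0 < \delta_1, \delta_2 < 1$ and suppose that $a$ and $a'$ satisfy $a - a' \ge r_p + r_{1-p} + \rho_{\delta_1} + \rho_{\delta_2}$ so that the interval
$[a' + r_{1-p} + \rho_{\delta_2},\ a - r_p - \rho_{\delta_1}]$ is nonempty. Let $b \in [a' + r_{1-p} + \rho_{\delta_2},\ a - r_p - \rho_{\delta_1}]$ and consider the
test $T$ of $\mathcal{H}_a^{\mathcal{D}}$ against $\mathcal{K}_{a'}^{\mathcal{D}}$ defined by

$$T := \begin{cases} 1, & \langle F_1 \rangle_{n_1} + \langle F_2 \rangle_{n_2} \ge b \\ 0, & \langle F_1 \rangle_{n_1} + \langle F_2 \rangle_{n_2} < b. \end{cases}$$

Then $U \in \mathcal{U}_{\mathcal{D}}$ and we have

$$\theta_1(U) \le \delta_1, \qquad \theta_2(U) \le \delta_2.$$

Moreover, suppose $\mathcal{D}_2 \le \mathcal{D}_1$, $n_2 \ge \frac{\mathcal{D}_2^2}{\mathcal{D}_1^2} n_1$ and define

$$\rho_t := \frac{\mathcal{D}_1}{\sqrt{n_1}}\sqrt{\log t^{-1}} \qquad (4)$$

in the above conditions on $a$, $a'$ and the test parameter $b$. Then for any $\mathcal{D} \ge 2\mathcal{D}_1$ we have $U \in \mathcal{U}_{\mathcal{D}}$,
and

$$\theta_1(U) \le \delta_1, \qquad \theta_2(U) \le \delta_2.$$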
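
To make the sample-size trade-off in Corollary 2.4 concrete, here is a minimal Python sketch of the certification test. It rests on one loudly stated assumption: the combined radius is taken to be $\rho_t = \frac{1}{\sqrt{2}}\sqrt{\mathcal{D}_1^2/n_1 + \mathcal{D}_2^2/n_2}\,\sqrt{\log t^{-1}}$, which is what McDiarmid's inequality gives for the sum of two independent sample means; the function names and numbers are hypothetical.

```python
import math
from statistics import fmean


def radius(t: float, diameter: float) -> float:
    """r_t = D * sqrt(log(1/t) / 2)."""
    return diameter * math.sqrt(math.log(1.0 / t) / 2.0)


def rho(t: float, d1: float, d2: float, n1: int, n2: int) -> float:
    """Assumed combined radius for the sum of the two sample means."""
    return math.sqrt((d1 * d1 / n1 + d2 * d2 / n2) * math.log(1.0 / t) / 2.0)


def certification_interval(a, a_prime, p, delta1, delta2, d1, d2, n1, n2):
    """Admissible interval for b with D = d1 + d2, or None if it is empty."""
    d = d1 + d2
    lo = a_prime + radius(1.0 - p, d) + rho(delta2, d1, d2, n1, n2)
    hi = a - radius(p, d) - rho(delta1, d1, d2, n1, n2)
    return (lo, hi) if lo <= hi else None


def certification_test(model_samples, deviation_samples, b: float) -> int:
    """T = 1 iff mean(F_1 samples) + mean(F_2 samples) >= b."""
    return 1 if fmean(model_samples) + fmean(deviation_samples) >= b else 0


if __name__ == "__main__":
    # d2 = d1 / 10, so n2 = n1 / 100 keeps both variance terms equal.
    print(rho(0.05, d1=1.0, d2=0.1, n1=1000, n2=10))
    print(certification_interval(3.0, 0.0, 0.95, 0.05, 0.05, 1.0, 0.1, 1000, 10))
```

In this sketch, with a deviation diameter ten times smaller than the model diameter, ten samples of the deviation contribute as much to $\rho_t$ as a thousand samples of the model, which is the quantitative form of the remark that far fewer physical experiments than model runs are needed.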



Remark 2.5 (Connection with Lucas, Owhadi and Ortiz [1]) If in Theorem 2.2 we define
$b := a' + r_{1-p} + r'_{\delta_2}$ in terms of a performance value $a'$, the condition for acceptance in Theorem
2.2 can be written $\langle F \rangle_n - a' - r'_{\delta_2} \ge r_{1-p}$ which amounts to

$$\frac{\langle F \rangle_n - a' - \frac{\mathcal{D}'}{\sqrt{2}}\sqrt{\log \delta_2^{-1}}}{\mathcal{D}} \ge \frac{1}{\sqrt{2}}\sqrt{\log (1-p)^{-1}}.$$

For the case of validation in Corollary 2.3 we have $\mathcal{D}_{F'} \le \frac{1}{\sqrt{n}}\mathcal{D}_F$ and so with $p = 1 - \epsilon$, $\delta_2 = \epsilon'$,
$\mathcal{D} := \mathcal{D}_F$, and $\mathcal{D}' := \frac{1}{\sqrt{n}}\mathcal{D}_F$ the test of Corollary 2.3 amounts essentially (with a $\sqrt{2}$ better
multiplicative factor in the last term on the left) to the validation criterion of Lucas, Owhadi, and
Ortiz [1, Eqn. 40] for the exact model with single performance measure (their Scenario 3). For
certification, we can apply Corollary 2.4 with the choice $F_1$ as their $F$ and $F_2$ as their $G - F$. Using
the inequality $\mathcal{D}_{F_1 + F_2} \le \mathcal{D}_{F_1} + \mathcal{D}_{F_2}$ and setting $n_1 = n_2$ and $\delta_2 = 2\epsilon'$ we again obtain essentially
(in a similar way as mentioned above) the certification criteria of [1, Eqns. 58 & 59]. Moreover,
Corollaries 2.3 and 2.4 show we can interpret the certification criteria of [1] as guarantees that the
type II error is less than $\delta_2$. If we then select $a$ such that $a - a' \ge r_p + r_{1-p} + \rho_{\delta_1} + \rho_{\delta_2}$ we can
also assert that the type I error is less than $\delta_1$. In particular, if the design parameter value $a'$ can
tolerate being moved so that $a - a' \ge r_p + r_{1-p} + \rho_{\delta_1} + \rho_{\delta_2}$ with $\delta_1$ and $\delta_2$ small, this criterion
amounts to a hypothesis test with type I and II errors bounded by $\delta_1$ and $\delta_2$ respectively. In this
sense the criteria of [1] appear to correspond with our hypothesis test but with the roles of the
hypothesis spaces $\mathcal{H}_a$ and $\mathcal{K}_{a'}$ reversed.

Remark 2.6 (Connection with QMU) For a detailed discussion of the QMU framework please
see [16, 17, 18]. In the QMU framework, the confidence is evaluated in terms of a ratio $\frac{M}{U}$ where
$M$ is a margin and $U$ is an uncertainty. The National Research Council of the National Academies
report [16, Finding 1-1] states that "QMU is a sound and valuable framework that aids the
assessment and evaluation of the confidence in the nuclear weapons stockpile." However it also states
"There are serious and difficult problems to be resolved in uncertainty quantification, however,
including the physical phenomena that are modeled crudely or not at all, the possibility of
unknown unknowns, lack of computing power to guarantee the convergence of codes, and insufficient
attention to validating experiments." Finally, they state that "Even if the uncertainties arising from
all of the different sources were estimated, their aggregation into an overall uncertainty for a given
quantity of interest is a problem that needs further attention." Although we do not suggest that
we can answer all these questions now, we can draw some conclusions along these lines using the
discussion of Remark 2.5. For validation, consider the ratio

$$\frac{\langle F \rangle_n - a'}{\mathcal{D}}$$

where the numerator is a "margin" and the denominator is an "uncertainty". The inequality

$$\frac{\langle F \rangle_n - a'}{\mathcal{D}} \ge \frac{1}{\sqrt{2}}\sqrt{\log (1-p)^{-1}} + \frac{1}{\sqrt{2n}}\sqrt{\log \delta_2^{-1}}$$

shows two things. First it shows how we can interpret confidence. That is, if $\mathbb{P}(F \ge a') < p$, namely
if the performance is insufficient, then with probability less than $\delta_2$ will we accept the performance
as sufficient. Namely, our confidence is $\delta_2$. Moreover, the precise definition of the uncertainty
parameter $\mathcal{D}$ shows how this parameter is aggregated so as to maintain the interpretation of the
confidence statement. For the certification problem similar comments also apply but we get the
added benefit of seeing how modeling uncertainties and validation uncertainties are aggregated and
combined and how they influence the number of validation experiments needed compared to the
number of modeling runs.



We have used McDiarmid's inequality, Theorem 2.1, as the model for concentration in this paper,
but that is not necessary. All that was needed is a concentration parameter $\mathcal{D}_F$ which scales a certain
way with sampling. In particular, concentration theorems that do not require i.i.d. sampling, for
example the martingale difference inequality [26, Thm. 3.14, Page 224], can be applied to derive
results similar to those obtained, though more complex. Another example of a concentration theorem
is the following for Lipschitz functions [27, Cor. 1.17]:

Theorem 2.7 Let $\mathcal{X} = \mathcal{X}_1 \times \cdots \times \mathcal{X}_m$ be the Cartesian product of metric spaces $(\mathcal{X}_i, d_i)$ with
diameters $D_i, i = 1,..,m$ and let $\mathcal{D}_{\mathcal{X}} := \sqrt{\sum_{i=1}^m D_i^2}$. Let $F : \mathcal{X} \to \mathbb{R}$ be Lipschitz with respect to
the $\ell_1$ metric $d := \sum_{i=1}^m d_i$ with Lipschitz constant $|F|$. Then for any product probability measure
$\mathbb{P} = \mu_1 \otimes \cdots \otimes \mu_m$ we have

$$\mathbb{P}(F - \mathbb{E}F \ge r) \le e^{-\frac{r^2}{2|F|^2\mathcal{D}_{\mathcal{X}}^2}}.$$

The following, easy to prove, proposition shows that the previous results also apply using the
Lipschitz concentration Theorem 2.7.

Proposition 2.8 Consider the concentration result and notation of Theorem 2.7. Then Theorem
2.2 and Corollaries 2.3 and 2.4 hold with $\mathcal{D}$ replaced by $2|F|\mathcal{D}_{\mathcal{X}}$.
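
The substitution behind Proposition 2.8 can be checked in one line. The display below is a sketch that assumes the Lipschitz bound of Theorem 2.7 has exponent $-r^2/(2|F|^2\mathcal{D}_{\mathcal{X}}^2)$ and the McDiarmid bound of Theorem 2.1 has exponent $-2r^2/\mathcal{D}^2$; setting $\mathcal{D} = 2|F|\mathcal{D}_{\mathcal{X}}$ in the latter reproduces the former.

```latex
\[
\exp\!\left(-\frac{2r^2}{\mathcal{D}^2}\right)\Bigg|_{\mathcal{D} = 2|F|\mathcal{D}_{\mathcal{X}}}
= \exp\!\left(-\frac{2r^2}{4|F|^2\mathcal{D}_{\mathcal{X}}^2}\right)
= \exp\!\left(-\frac{r^2}{2|F|^2\mathcal{D}_{\mathcal{X}}^2}\right).
\]
```

Since the earlier results use the concentration inequality only through this tail bound, the same radii $r_t$ apply with $2|F|\mathcal{D}_{\mathcal{X}}$ in the role of the diameter.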



3 Extrapolative Validation and Certification 

In this section we consider the case where we want to validate a model or certify a physical system in a
regime where the real physical system is impossible or expensive to sample. That is, suppose we
wish to validate or certify a random variable $\hat{F}(X)$ which is expensive or impossible to sample but
are able to sample a related random variable $F(X)$. When samples from $\hat{F}(X)$ are unavailable we
have the following validation result in terms of the Kolmogorov distance

$$d(F, \hat{F}) := \sup_{b \in \mathbb{R}} |\mathbb{P}(F \le b) - \mathbb{P}(\hat{F} \le b)| \qquad (5)$$

between two random variables $F$ and $\hat{F}$. The corresponding certification result is very similar, but
we omit it for brevity.

Theorem 3.1 (Extrapolative Validation) Let $U = F(X)$ have McDiarmid diameter $\mathcal{D}_F$. Let
$F(X_i), i = 1,..,n$ be i.i.d. samples of $F(X)$ and define $\langle F \rangle_n := \frac{1}{n}\sum_{i=1}^n F(X_i)$ to be the sample
mean. Let $0 < p < 1$, $0 < \delta_p < \min(p, 1 - p)$ and suppose that $\hat{F} : \mathcal{X} \to \mathbb{R}$ satisfies

$$d(F, \hat{F}) \le \delta_p.$$

Let $0 < \delta_1, \delta_2 < 1$ and define

$$r_H := \frac{1}{\sqrt{2}}\sqrt{\log (p - \delta_p)^{-1}} + \frac{1}{\sqrt{2n}}\sqrt{\log \delta_1^{-1}}, \qquad
r_K := \frac{1}{\sqrt{2}}\sqrt{\log (1 - p - \delta_p)^{-1}} + \frac{1}{\sqrt{2n}}\sqrt{\log \delta_2^{-1}}.$$

Then if $a - a' \ge \mathcal{D}_F (r_H + r_K)$ the test of $\{\mathbb{P}(\hat{F} \ge a) \ge p\}$
versus $\{\mathbb{P}(\hat{F} \ge a') < p\}$ defined by

$$T := \begin{cases} 1, & \langle F \rangle_n \ge a - \mathcal{D}_F r_H \\ 0, & \langle F \rangle_n < a - \mathcal{D}_F r_H \end{cases}$$

satisfies

$$\theta_1 \le \delta_1, \qquad \theta_2 \le \delta_2.$$


When samples from $\hat{F}$ are available but more expensive than samples from $F$, we can use the sample
data to estimate the Kolmogorov distance between $F$ and $\hat{F}$ and then incorporate the estimate in the
test as discussed in Section 4 and afterwards. The following estimate is efficient in the sense that
it uses the concentration of the Kolmogorov-Smirnov statistic of Dvoretzky, Kiefer and Wolfowitz
[28], improved to have a tight constant by Massart [29] (see also [30, Thm. 12.9]), as follows: Let $n$
i.i.d. samples be taken from $F$ and let $P_n$ denote its empirical measure and let $n' \le n$ i.i.d. samples be
taken from $\hat{F}$ and let $P_{n'}$ denote its empirical measure. Then the Dvoretzky, Kiefer and Wolfowitz
Theorem states that

$$\mathbb{P}^n\Big(\sup_{b \in \mathbb{R}} |P_n(F \le b) - \mathbb{P}(F \le b)| \ge \epsilon\Big) \le 2e^{-2n\epsilon^2}$$

and

$$\mathbb{P}^{n'}\Big(\sup_{b \in \mathbb{R}} |P_{n'}(\hat{F} \le b) - \mathbb{P}(\hat{F} \le b)| \ge \epsilon\Big) \le 2e^{-2n'\epsilon^2}.$$

Let us define

$$d_{n,n'}(F, \hat{F}) := \sup_{b \in \mathbb{R}} |P_n(F \le b) - P_{n'}(\hat{F} \le b)|$$

as an estimator of the Kolmogorov distance $d(F, \hat{F})$ defined in (5). Then since

$$|d(F, \hat{F}) - d_{n,n'}(F, \hat{F})| = \Big|\sup_{b \in \mathbb{R}} |\mathbb{P}(F \le b) - \mathbb{P}(\hat{F} \le b)| - \sup_{b \in \mathbb{R}} |P_n(F \le b) - P_{n'}(\hat{F} \le b)|\Big|$$

$$\le \sup_{b \in \mathbb{R}} |P_n(F \le b) - \mathbb{P}(F \le b)| + \sup_{b \in \mathbb{R}} |P_{n'}(\hat{F} \le b) - \mathbb{P}(\hat{F} \le b)|$$

we use $n' \le n$ to conclude by a simple union bound that

$$\mathbb{P}^{n+n'}\big(|d(F, \hat{F}) - d_{n,n'}(F, \hat{F})| \ge \epsilon\big)$$

$$\le \mathbb{P}^n\Big(\sup_{b \in \mathbb{R}} |P_n(F \le b) - \mathbb{P}(F \le b)| \ge \frac{\epsilon}{2}\Big) + \mathbb{P}^{n'}\Big(\sup_{b \in \mathbb{R}} |P_{n'}(\hat{F} \le b) - \mathbb{P}(\hat{F} \le b)| \ge \frac{\epsilon}{2}\Big)$$

$$\le 4e^{-\frac{1}{2}n'\epsilon^2}.$$

That is, we have

$$\mathbb{P}^{n+n'}\big(|d(F, \hat{F}) - d_{n,n'}(F, \hat{F})| \ge \epsilon\big) \le 4e^{-\frac{1}{2}n'\epsilon^2},$$

whose confidence form is

$$\mathbb{P}^{n+n'}\Bigg(|d(F, \hat{F}) - d_{n,n'}(F, \hat{F})| \ge \sqrt{\frac{2\ln 4 + 2\ln \delta^{-1}}{n'}}\Bigg) \le \delta. \qquad (6)$$

This estimation inequality (6) can be used, along the lines of Section 4 and afterwards, to prove a
version of Theorem 3.1 where the estimate $d_{n,n'}(F, \hat{F})$ is used instead of the Kolmogorov distance
$d(F, \hat{F})$. Moreover, since the test and its performance depend logarithmically on this estimate, we
should be able to obtain good tests where $n'$ is much smaller than $n$. In particular, we should be
able to obtain good tests when the Kolmogorov distance is small enough, by estimating it instead of
assuming that it is so. However, for brevity, we do not complete this program here but move to the estimation of
diameters in validation and certification tests.
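
The estimator $d_{n,n'}$ and the confidence radius in (6) are easy to compute from two sample sets. The Python sketch below is an illustration under the reading of (6) as $\sqrt{2\ln(4/\delta)/n'}$; the function names are hypothetical. It uses the fact that the difference of two empirical distribution functions is a right-continuous step function, so its supremum is attained at one of the sample points.

```python
import bisect
import math


def empirical_cdf(sorted_samples, b: float) -> float:
    """Fraction of samples that are <= b (samples must be sorted)."""
    return bisect.bisect_right(sorted_samples, b) / len(sorted_samples)


def kolmogorov_distance(samples_f, samples_f_hat) -> float:
    """Two-sample estimate sup_b |P_n(F <= b) - P_n'(F_hat <= b)|."""
    xs = sorted(samples_f)
    ys = sorted(samples_f_hat)
    # Both empirical CDFs only jump at sample points, so checking those suffices.
    return max(abs(empirical_cdf(xs, b) - empirical_cdf(ys, b)) for b in xs + ys)


def dkw_radius(n_small: int, delta: float) -> float:
    """Radius eps solving 4 * exp(-n' * eps^2 / 2) = delta, with n' = n_small."""
    return math.sqrt(2.0 * math.log(4.0 / delta) / n_small)


if __name__ == "__main__":
    cheap = [0.1 * i for i in range(100)]       # hypothetical samples of F
    costly = [0.5 * i + 0.2 for i in range(20)]  # hypothetical samples of F_hat
    estimate = kolmogorov_distance(cheap, costly)
    print(estimate, estimate + dkw_radius(len(costly), delta=0.05))
```

The radius depends only on the smaller sample size $n'$, so adding cheap samples beyond $n \ge n'$ does not tighten this particular bound.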

4 Estimation of Diameters in Hypothesis Tests 



The validation and certification results, Corollaries 2.3 and 2.4, require the value of the diameter
$\mathcal{D}_F$ for validation and $\mathcal{D}_{F-M}$ and $\mathcal{D}_M$ for certification. In principle the modeling and domain
experts should have much to say about bounding these values. However, sample data should also
say something about them. With the eventual goal of combining expert knowledge about the
relevant diameters with information from sample data, we now proceed to describe how sample
data can be used to estimate these diameters. This will be accomplished through an estimation
procedure and the introduction of "higher order" concentration parameters. To that end, we now
invert the concentration theorem to its "confidence version" so that the diameters appear inside the
probability statement. This allows the comparison of the diameter with an estimable parameter
and a mechanism for incorporating estimates of these parameters in the concentration theorems
and therefore into the definitions of tests and the analysis of their performance.



4.1 Diameters in Concentration Theorems 

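By a simple function inversion, McDiarmid's inequality can be written

$$\mathbb{P}\big(F - \mathbb{E}F \ge f(\mathcal{D}_F, \delta)\big) \le \delta \qquad (7)$$

where $f(r, \delta) := \frac{r}{\sqrt{2}}\sqrt{\log \delta^{-1}}$. This inversion was used in the proof of the main Theorem 2.2. The
following two lemmas reformulate those parts of Theorem 2.2 which we will use as basic building
blocks for developing validation and certification tests with estimated diameters.

Lemma 4.1 Let $0 < p < 1$ and $a, a' \in \mathbb{R}$ and consider the functions $f_H : \mathbb{R}^3 \to \mathbb{R}$ and $f_K : \mathbb{R}^3 \to \mathbb{R}$
defined by

$$f_H(r, r', \delta) := \frac{r}{\sqrt{2}}\sqrt{\log p^{-1}} + \frac{r'}{\sqrt{2}}\sqrt{\log \delta^{-1}} - a,$$

$$f_K(r, r', \delta) := \frac{r}{\sqrt{2}}\sqrt{\log (1 - p)^{-1}} + \frac{r'}{\sqrt{2}}\sqrt{\log \delta^{-1}} + a'.$$

Then for all $0 < \delta < 1$ and all $F, F' \in \mathcal{U}$ which satisfy $\mathbb{E}F' = \mathbb{E}F$, we have

$$\mathbb{P}\big(F' < -f_H(\mathcal{D}_F, \mathcal{D}_{F'}, \delta) \,\big|\, F \in \mathcal{H}_a\big) \le \delta,$$

$$\mathbb{P}\big(F' \ge f_K(\mathcal{D}_F, \mathcal{D}_{F'}, \delta) \,\big|\, F \in \mathcal{K}_{a'}\big) \le \delta.$$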
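The following simple lemma shows how to use the results of Lemma 4.1 to construct hypothesis
tests with controlled errors. It is formulated in terms of the primary variable $F$, a test variable $F'$,
and a vector $\vec{F}$ of auxiliary variables.

Lemma 4.2 Let $\mathcal{H}, \mathcal{K} \subset \mathcal{U}$ be null and alternative hypothesis spaces and let $k \in \mathbb{N}$ and $0 < \delta_1, \delta_2 < 1$.
Consider functions $g_H, g_K : \mathcal{U}^k \to \mathbb{R}$ such that for all $F, F' \in \mathcal{U}$ there exists a vector $\vec{F}$ of
auxiliary random variables $F^j, j = 1,..,k$ such that

$$\mathbb{P}\big(F' < -g_H(\vec{F}) \,\big|\, F \in \mathcal{H}\big) \le \delta_1,$$

$$\mathbb{P}\big(F' \ge g_K(\vec{F}) \,\big|\, F \in \mathcal{K}\big) \le \delta_2.$$

We call any such vector $\vec{F}$ admissible for $F, F'$. Now suppose $F, F' \in \mathcal{U}$ and consider any admissible
vector $\vec{F}$. Consider the test $T$ of $F \in \mathcal{H}$ against $F \in \mathcal{K}$ defined by

$$T := \begin{cases} 1, & F' \ge -g_H(\vec{F}) \\ 0, & F' < -g_H(\vec{F}). \end{cases}$$

Then if $g_H(\vec{F}) + g_K(\vec{F}) \le 0$ we have

$$\theta_1(T) \le \delta_1,$$

$$\theta_2(T) \le \delta_2.$$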
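
The way Lemmas 4.1 and 4.2 fit together can be sketched numerically. The Python code below assumes the forms $f_H(r, r', \delta) = \frac{r}{\sqrt{2}}\sqrt{\log p^{-1}} + \frac{r'}{\sqrt{2}}\sqrt{\log \delta^{-1}} - a$ and $f_K(r, r', \delta) = \frac{r}{\sqrt{2}}\sqrt{\log(1-p)^{-1}} + \frac{r'}{\sqrt{2}}\sqrt{\log \delta^{-1}} + a'$, and treats the diameters as known numbers; the function names and values are hypothetical.

```python
import math


def f_h(r: float, r_prime: float, delta: float, a: float, p: float) -> float:
    """Assumed form of f_H from Lemma 4.1."""
    return (r * math.sqrt(math.log(1.0 / p) / 2.0)
            + r_prime * math.sqrt(math.log(1.0 / delta) / 2.0) - a)


def f_k(r: float, r_prime: float, delta: float, a_prime: float, p: float) -> float:
    """Assumed form of f_K from Lemma 4.1."""
    return (r * math.sqrt(math.log(1.0 / (1.0 - p)) / 2.0)
            + r_prime * math.sqrt(math.log(1.0 / delta) / 2.0) + a_prime)


def errors_controlled(g_h_value: float, g_k_value: float) -> bool:
    """Condition g_H + g_K <= 0 under which Lemma 4.2 bounds both errors."""
    return g_h_value + g_k_value <= 0.0


def lemma_test(f_prime_value: float, g_h_value: float) -> int:
    """T = 1 iff the observed test variable F' is at least -g_H."""
    return 1 if f_prime_value >= -g_h_value else 0


if __name__ == "__main__":
    g_h = f_h(r=1.0, r_prime=0.1, delta=0.05, a=3.0, p=0.95)
    g_k = f_k(r=1.0, r_prime=0.1, delta=0.05, a_prime=0.0, p=0.95)
    print(g_h, g_k, errors_controlled(g_h, g_k), lemma_test(2.8, g_h))
```

Here $-g_H$ is the right endpoint and $g_K$ the left endpoint of the interval for $b$ in Theorem 2.2, so the condition $g_H + g_K \le 0$ is exactly the statement that this interval is nonempty.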
In general, concentration theorems can be used to establish results like Lemma 4.1 and then
Lemma 4.2 can be used to establish a test and bound its errors. In particular, we see how the
main theorem, Theorem 2.2, with test point fixed at the right-hand side of the interval, can then
be obtained by a combined application of Lemmas 4.1 and 4.2: first apply Lemma 4.1 and then
apply Lemma 4.2 with $\mathcal{H} := \mathcal{H}_a$, $\mathcal{K} := \mathcal{K}_{a'}$, $\vec{F} := (F, F')$ and

$$g_H(\vec{F}) = g_H(F, F') := f_H(\mathcal{D}_F, \mathcal{D}_{F'}, \delta_1),$$

$$g_K(\vec{F}) = g_K(F, F') := f_K(\mathcal{D}_F, \mathcal{D}_{F'}, \delta_2).$$

However, what is important here is that we are now in a position to define tests which use estimates
of $f_H(\mathcal{D}_F, \mathcal{D}_{F'}, \delta_1)$ and $f_K(\mathcal{D}_F, \mathcal{D}_{F'}, \delta_2)$. Since we see no efficient way of estimating the McDiarmid
diameter $\mathcal{D}_F$ of a function $F$ but we do know something about the estimation of the usual diameter
$D_G$ of a function $G$ defined by

$$D_G := \sup_{x, x' \in \mathcal{X}} \big(G(x) - G(x')\big),$$

we ask whether we can estimate the McDiarmid diameter by estimating the usual diameters of a set
of auxiliary functions. To that end we first introduce a relationship between the McDiarmid
diameter and the usual diameters of a set of auxiliary observables. These latter diameters we will
then estimate using extreme value estimators in Section 4.2. Now, ignoring for the moment the
question of the attainment of suprema, if we define

$$F^j(x_j) := F(x_1^*, .., x_{j-1}^*, x_j, x_{j+1}^*, .., x_m^*)$$

where

$$(x_1^*, .., x_{j-1}^*, x_{j+1}^*, .., x_m^*) := \arg\max \max_{x_j, x'_j} \big(F(x_1, .., x_{j-1}, x_j, x_{j+1}, .., x_m) - F(x_1, .., x_{j-1}, x'_j, x_{j+1}, .., x_m)\big)$$

it follows that

$$\mathcal{D}_F^2 = \sum_{j=1}^m D_{F^j}^2. \qquad (8)$$

Namely the McDiarmid diameter is a function of diameters. However, this relation will only be
of use to us if the functions $F^j, j = 1,..,k$ are observable, namely, they can be evaluated. Now
suppose we are in possession of a set $F^j, j = 1,..,k$ of auxiliary observables and let $D$ denote the
vector of their diameters. Suppose we also have functions $g_H$ and $g_K$ such that

$$f_H(\mathcal{D}_F, \mathcal{D}_{F'}, \delta) \le g_H(D, \delta), \qquad 0 < \delta < 1,$$

$$f_K(\mathcal{D}_F, \mathcal{D}_{F'}, \delta) \le g_K(D, \delta), \qquad 0 < \delta < 1.$$

Then since Lemma 4.1 asserts that for all $0 < \delta < 1$ we have

$$\mathbb{P}\big(F' < -f_H(\mathcal{D}_F, \mathcal{D}_{F'}, \delta) \,\big|\, \mathcal{H}\big) \le \delta,$$

$$\mathbb{P}\big(F' \ge f_K(\mathcal{D}_F, \mathcal{D}_{F'}, \delta) \,\big|\, \mathcal{K}\big) \le \delta,$$

it follows easily that for all $0 < \delta < 1$ we have

$$\mathbb{P}\big(F' < -g_H(D, \delta) \,\big|\, \mathcal{H}\big) \le \delta, \qquad (9)$$

$$\mathbb{P}\big(F' \ge g_K(D, \delta) \,\big|\, \mathcal{K}\big) \le \delta. \qquad (10)$$
Consequently, we can apply Lemma 4.2 to obtain tests defined instead in terms of the estimable
functions $g_H(D, \delta)$ and $g_K(D, \delta)$. Most importantly, the inequalities (9) and (10) remain valid with the
vector of diameters $D$ replaced by the vector of essential diameters. When the essential diameter is
much smaller than the given diameter, this difference can often offset the looseness corresponding
to the error associated with estimating the essential diameter using the empirical diameter.

Let us now give the first important example of auxiliary observables. In this case, they will be none other than the functions $F, F'$ themselves, but will require the introduction of new functions, $c_F, c_{F'}$ of $F$ and $F'$, which will have to be approximately known. To that end, define a coefficient of separability $c_F$ of the function $F$ with respect to the $m$ components of $\mathcal{X} := \prod_{j=1}^m \mathcal{X}^j$ as follows:

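Definition 4.3 Let $\mathcal{X} := \prod_{j=1}^m \mathcal{X}^j$ be a product and consider a function $F : \mathcal{X} \to \mathbb{R}$, its diameter $D_F$, and its McDiarmid diameter $\mathcal{D}_F$. We define the coefficient of separability $c_F$ with respect to the product $\mathcal{X}$ to be

$$c_F := \frac{\mathcal{D}_F}{D_F}.$$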

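With this definition it is clear that if we define $g_{\mathcal{H}}(D_F, D_{F'}, \delta) := f_{\mathcal{H}}(c_F D_F, c_{F'} D_{F'}, \delta)$ and $g_{\mathcal{K}}(D_F, D_{F'}, \delta) := f_{\mathcal{K}}(c_F D_F, c_{F'} D_{F'}, \delta)$, where we suppress the dependency on $c_F, c_{F'}$, we have

$$f_{\mathcal{H}}(\mathcal{D}_F, \mathcal{D}_{F'}, \delta) = g_{\mathcal{H}}(D_F, D_{F'}, \delta), \quad 0 < \delta < 1,$$

$$f_{\mathcal{K}}(\mathcal{D}_F, \mathcal{D}_{F'}, \delta) = g_{\mathcal{K}}(D_F, D_{F'}, \delta), \quad 0 < \delta < 1,$$

and therefore

$$\mathbb{P}^n\bigl(F' \le -g_{\mathcal{H}}(D_F, D_{F'}, \delta) \mid \mathcal{H}\bigr) \le \delta,$$

$$\mathbb{P}^n\bigl(F' > g_{\mathcal{K}}(D_F, D_{F'}, \delta) \mid \mathcal{K}\bigr) \le \delta.$$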



Although we have now introduced a new function $c_F$ which will have to be known or well bounded, this function has nice properties, which we now describe, that make assuming its value a weaker assumption than assuming the value of a McDiarmid diameter. Let us say that a map $\phi : \prod_{j=1}^m \mathcal{X}^j \to \prod_{j=1}^m \mathcal{X}'^j$ is a diagonal bijection if it is a product map $\phi = \prod_{j=1}^m \phi_j$ such that $\phi_j : \mathcal{X}^j \to \mathcal{X}'^j$ is a bijection for all $j = 1,..,m$. The following lemma shows that $F \mapsto c_F$ is bounded, invariant under non-singular affine transformations $F \mapsto aF + b$ of the function $F$, and a diagonal bijective invariant.

Lemma 4.4 The mapping $F \mapsto c_F$ is a diagonal bijective invariant. Moreover, we have

$$c_{aF+b} = c_F, \quad a, b \in \mathbb{R},\ a \ne 0,$$

and

$$\frac{1}{\sqrt{m}} \le c_F \le \sqrt{m}.$$

In Example 7.2 in the Appendix we describe the attainment of the extreme cases $c_F = \frac{1}{\sqrt{m}}$ and $c_F = \sqrt{m}$: roughly, the lower bound is attained for functions which are separable in the $m$ components and the upper bound is attained for a function related to the Euclidean metric.[2]
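The two extreme cases can be checked numerically on a small grid. The following is a minimal sketch, assuming a finite product grid in place of $\mathcal{X}$; the helper names (`diameters`, `separability_coefficient`, `truncated_norm`) are illustrative and do not come from the paper. It evaluates the diameter, the partial diameters, and their ratio $c_F$ for a separable sum and for the truncated Euclidean norm of Example 7.2.

```python
import itertools
import math


def diameters(F, grids):
    """Return the diameter of F and its m partial diameters on a finite product grid."""
    points = list(itertools.product(*grids))
    values = {x: F(x) for x in points}
    full = max(values.values()) - min(values.values())
    partial = []
    for j, grid_j in enumerate(grids):
        d_j = 0.0
        for x in points:
            # Vary only coordinate j while holding the other coordinates of x fixed.
            column = [values[x[:j] + (t,) + x[j + 1:]] for t in grid_j]
            d_j = max(d_j, max(column) - min(column))
        partial.append(d_j)
    return full, partial


def separability_coefficient(F, grids):
    """c_F = (McDiarmid diameter) / (diameter), evaluated on the grid."""
    full, partial = diameters(F, grids)
    mcdiarmid = math.sqrt(sum(d * d for d in partial))
    return mcdiarmid / full


def truncated_norm(x):
    """Euclidean norm inside the unit ball, zero outside (second case of Example 7.2)."""
    r = math.sqrt(sum(t * t for t in x))
    return r if r <= 1.0 else 0.0


m = 3
grids = [(0.0, 1.0)] * m  # the corners of [0, 1]^m already attain both extremes
c_separable = separability_coefficient(sum, grids)  # F(x) = x_1 + ... + x_m
c_euclidean = separability_coefficient(truncated_norm, grids)
```

On this grid the separable sum sits at the lower bound $1/\sqrt{m}$ and the truncated norm at the upper bound $\sqrt{m}$. A grid evaluation only approximates the suprema in general, so for other functions the result should be read as an estimate.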

2 In personal communication, L. Gurvits has demonstrated that nontrivial lower bounds may not exist when $\mathcal{X}$ is not a product. For example, it is easy to construct cases where the partial diameters are all zero and the diameter is not.

Lemma 4.2 states that conditions such as

$$\mathbb{P}^n\bigl(F' \le -g_{\mathcal{H}}(D) \mid \mathcal{H}\bigr) \le \delta_1, \tag{11}$$

$$\mathbb{P}^n\bigl(F' > g_{\mathcal{K}}(D) \mid \mathcal{K}\bigr) \le \delta_2, \tag{12}$$

and $g_{\mathcal{H}}(D) + g_{\mathcal{K}}(D) \le 0$ for functions of the essential diameter vector of auxiliary observables are sufficient to develop a good test. In the above analysis this was accomplished by knowledge about the coefficients of separability $c_F, c_{F'}$, which allowed us to use the functions $F$ and $F'$ as their own auxiliary observables. However, relations such as (11), where the inequalities are in terms of the usual diameters, can be obtained through other concentration inequalities. For example, if we instead appeal to the Lipschitz concentration Theorem 2.7 it is easy to obtain inequalities (11) from Lemma 4.1 with $\mathcal{D}_F$ replaced by $2|F|\mathcal{D}_X$. However, it is easy to show that

$$\mathcal{D}_F \le |F|\,\mathcal{D}_X,$$

indicating that McDiarmid's Theorem 2.1 provides a superior concentration guarantee. On the other hand, since $\mathcal{D}_X$ is a sum of diameters, and the supremum

$$|F| := \sup_{x \ne x'} \frac{|F(x) - F(x')|}{d(x, x')}$$

can be estimated by the empirical Lipschitz coefficient

$$|\hat{F}| := \sup_{X_i \ne X_{i'},\ i, i' = 1,..,n} \frac{|F(X_i) - F(X_{i'})|}{d(X_i, X_{i'})},$$

it follows that $|F|$ and $\mathcal{D}_X$ can be estimated from sample data using Corollary 4.9 in Section 4.2. However, this will require the random variable $X$ to be observable. Consequently, when $\mathcal{D}_F$ has no readily apparent auxiliary observables (such as when no knowledge of the coefficient of separability $c_F$ is available), and $X$ is observable, using the Lipschitz concentration Theorem 2.7 may prove fruitful.
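The empirical Lipschitz coefficient is simple to compute when $X$ is observable. The sketch below is illustrative only: the metric `dist` is passed in as a parameter because the metric used in Theorem 2.7 is not restated in this section, and the Euclidean choice in the usage lines is an assumption.

```python
import math


def empirical_lipschitz(samples, F, dist):
    """Largest slope |F(x) - F(y)| / dist(x, y) over pairs of distinct sample points."""
    best = 0.0
    for i, x in enumerate(samples):
        for y in samples[i + 1:]:
            d = dist(x, y)
            if d > 0.0:  # skip coincident points, matching the supremum over X_i != X_i'
                best = max(best, abs(F(x) - F(y)) / d)
    return best


def euclidean(x, y):
    """Euclidean distance, used here purely as an example metric."""
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(x, y)))


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
slope = empirical_lipschitz(points, lambda x: 3.0 * x[0] + x[1], euclidean)
```

Because the maximum runs over finitely many sampled pairs, the empirical coefficient never exceeds the true Lipschitz constant. In this toy case it is 3, while the true constant of the linear map is $\sqrt{10}$.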



4.2 Concentration of Empirical Quantiles 

Since we will be concerned with the effects of estimating essential diameters using sample data, we now describe results, of independent interest, concerning the concentration of empirical quantiles about distributional quantiles and show how to use them to bound empirical diameters with respect to essential diameters. Let $X$ be a real random variable with probability measure $\mathbb{P}$ and recall its distribution function $\mathbb{F}(\xi) := \mathbb{P}(X \le \xi)$. For $0 < p < 1$ define the quantiles $\xi_p := \mathbb{F}^{-1}(p) := \inf\{\xi : \mathbb{F}(\xi) \ge p\}$. We will use important properties of $\mathbb{F}$ and $\mathbb{F}^{-1}$ listed in Lemma 7.1 in the Appendix. Moreover, let $X_i, i = 1,..,n$ be i.i.d. samples from $X$. Let $\mathbb{P}_n$ denote the corresponding empirical measure, denote by $\mathbb{F}_n$ its corresponding distribution function, and let $\hat{\xi}_p := \inf\{\xi : \mathbb{F}_n(\xi) \ge p\}$ denote the empirical quantiles. We will use the following improvement of a theorem of Serfling [31, Thm. 2.3.2].

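Theorem 4.5 Let $0 < q < 1$ and suppose that $\xi > \xi_q$. Then with $\delta_1 := \mathbb{F}(\xi) - q$ we have

i) $\mathbb{P}^n(\hat{\xi}_q > \xi) \le e^{-2n\delta_1^2}$

ii) $\mathbb{P}^n(\hat{\xi}_q > \xi) \le e^{-\frac{n\delta_1^2}{2(1 - \mathbb{F}(\xi)) + \frac{2}{3}\delta_1}}$

iii) $\mathbb{P}^n(\hat{\xi}_q > \xi) \le e^{-\frac{n\delta_1^2}{2\mathbb{F}(\xi)}}$

On the other hand suppose that $\xi < \xi_q$. Then with $\delta_2 := q - \mathbb{F}(\xi)$ we have

i) $\mathbb{P}^n(\hat{\xi}_q \le \xi) \le e^{-2n\delta_2^2}$

ii) $\mathbb{P}^n(\hat{\xi}_q \le \xi) \le e^{-\frac{n\delta_2^2}{2\mathbb{F}(\xi) + \frac{2}{3}\delta_2}}$

iii) $\mathbb{P}^n(\hat{\xi}_q \le \xi) \le e^{-\frac{n\delta_2^2}{2(1 - \mathbb{F}(\xi))}}$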
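Which of the three bounds is sharpest depends on where $\mathbb{F}(\xi)$ sits in $[0, 1]$. The sketch below evaluates the first set of bounds, assuming they take the standard Hoeffding, Bernstein-type, and relative-deviation forms, with exponents $2n\delta_1^2$, $n\delta_1^2 / (2(1-\mathbb{F}(\xi)) + \tfrac{2}{3}\delta_1)$, and $n\delta_1^2 / (2\mathbb{F}(\xi))$. The function name is hypothetical.

```python
import math


def upper_deviation_bounds(n, F_xi, q):
    """Bounds i)-iii) on P(empirical q-quantile > xi), given F(xi) > q."""
    d1 = F_xi - q
    if d1 <= 0.0:
        raise ValueError("requires F(xi) > q")
    hoeffding = math.exp(-2.0 * n * d1 ** 2)
    bernstein = math.exp(-n * d1 ** 2 / (2.0 * (1.0 - F_xi) + 2.0 * d1 / 3.0))
    relative = math.exp(-n * d1 ** 2 / (2.0 * F_xi))
    return hoeffding, bernstein, relative


# Same gap delta_1 = 0.05, once in the upper tail and once near the median.
tail_case = upper_deviation_bounds(100, 0.95, 0.90)
median_case = upper_deviation_bounds(100, 0.55, 0.50)
```

In the tail case the Bernstein-type bound ii) is the smallest, since $1 - \mathbb{F}(\xi)$ is small. Near the median, Hoeffding's bound i) is the smallest. Bound iii) becomes the useful one when $\mathbb{F}(\xi)$ itself is small, which is how it is used for the empirical infimum in the proof of Theorem 4.6.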
Theorem 4.5 now gives us a good tool to compare empirical diameters with quantiles.

Theorem 4.6 Let $X$ be a real random variable and let $X_i, i = 1,..,n$ be i.i.d. samples. For $0 < p < 1$,

i) Let $D_n := \sup_{i=1,..,n} X_i - \inf_{i=1,..,n} X_i$ denote the empirical range. Then we have

$$\mathbb{P}^n\bigl(D_n < \xi_p - \xi_{1-p}\bigr) \le 2e^{-\frac{n}{2}(1-p)}.$$

ii) Suppose $X$ is a non-negative random variable and let $S_n := \sup_{i=1,..,n} X_i$ denote the empirical supremum. Then we have

$$\mathbb{P}^n\bigl(S_n < \xi_p\bigr) \le e^{-\frac{n}{2}(1-p)}.$$



We now show how to use Theorem 4.6 to bound the empirical diameters in terms of essential diameters. To that end, let $X_- := \operatorname{ess\,inf} X$ and $X_+ := \operatorname{ess\,sup} X$. Then the essential diameter is $D := X_+ - X_-$. We introduce a tail function quantifying the behavior of a random variable near its range limit.

Definition 4.7 Let the tail function $\tau_X$ corresponding to $X$ be defined by

$$\tau_X(\epsilon) := \sup \delta, \quad \epsilon > 0, \tag{13}$$

where the supremum is taken over those $\delta$ for which

$$\xi_{1-\delta} - \xi_{\delta} \ge \frac{D}{1+\epsilon}. \tag{14}$$

Roughly speaking, the function $\tau(\epsilon)$ is such that the set obtained by eliminating the right and left tails of mass $\tau$ is at least as large as the diameter scaled by $1/(1+\epsilon)$. Characterization of tail behaviors lies at the heart of the theory of the limiting behavior of extreme order statistics (see e.g. Arov and Bobrov [32], Pickands [33], Barndorff-Nielsen [34]) and will no doubt be useful when the diameters are unbounded, but since we concern ourselves with the bounded case here, the tail function (13) appears sufficient for our needs. The following proposition provides a lower bound for $\tau(\epsilon)$ in terms of the distribution function for $X$.

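Proposition 4.8 Let $X$ be a real random variable and suppose that $X_- := \operatorname{ess\,inf} X$ and $X_+ := \operatorname{ess\,sup} X$ are finite. Then in terms of the essential diameter $D := X_+ - X_-$, we have

$$\tau(\epsilon) \ge \min\Bigl(\mathbb{F}\bigl(X_- + \tfrac{\epsilon}{2(1+\epsilon)}D\bigr),\ 1 - \mathbb{F}\bigl(X_+ - \tfrac{\epsilon}{2(1+\epsilon)}D\bigr)\Bigr), \quad \epsilon > 0.$$

As an elementary application, consider the case where the tails are not too thin. That is, suppose for some $\kappa > 0$ we have $\mathbb{F}(X_+ - x) \le 1 - (\frac{x}{D})^{\kappa}$, $0 \le x \le D$, and $\mathbb{F}(X_- + x) \ge (\frac{x}{D})^{\kappa}$, $0 \le x \le D$. We conclude from Proposition 4.8 that

$$\tau_X(\epsilon) \ge \Bigl(\frac{\epsilon}{2(1+\epsilon)}\Bigr)^{\kappa},$$

which for $\epsilon$ small is $\tau_X(\epsilon) \gtrsim (\frac{\epsilon}{2})^{\kappa}$. For non-negative random variables we proceed similarly to definition (13) and define

$$\tau^+(\epsilon) := \sup \delta, \quad \epsilon > 0, \tag{15}$$

where the supremum is taken over those $\delta$ for which

$$\xi_{1-\delta} \ge \frac{X_+}{1+\epsilon}. \tag{16}$$

Arguments similar to those used in the proof of Proposition 4.8 imply that $\tau^+(\epsilon) \ge 1 - \mathbb{F}(\frac{X_+}{1+\epsilon})$. We are now in a position to compare the empirical diameter with the essential diameter using the tail function $\tau$.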



Corollary 4.9 Let $X$ be a real random variable and let $X_i, i = 1,..,n$ be i.i.d. samples from $X$. Then

i) Let $D_n := \sup_{i=1,..,n} X_i - \inf_{i=1,..,n} X_i$ denote the empirical range and let $\tau$ be defined by (13). Then for all $\epsilon > 0$ we have

$$\mathbb{P}^n\Bigl(D_n < \frac{D}{1+\epsilon}\Bigr) \le 2e^{-\frac{n\tau(\epsilon)}{2}}.$$

ii) Suppose $X$ is a non-negative random variable and let $S_n := \sup_{i=1,..,n} X_i$ denote the empirical supremum and let $\tau^+$ be defined by (15). Then for all $\epsilon > 0$ we have

$$\mathbb{P}^n\Bigl(S_n < \frac{X_+}{1+\epsilon}\Bigr) \le e^{-\frac{n\tau^+(\epsilon)}{2}}.$$
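Part i) can be checked exactly for a Uniform[0, 1] sample, which is used here only as an illustrative case and is not taken from the paper. There $D = 1$, and the range of $n$ samples has the closed-form distribution $\mathbb{P}(D_n \le r) = n r^{n-1} - (n-1) r^n$. Assuming the tail function is read as the largest tail mass $\delta$ with $\xi_{1-\delta} - \xi_\delta \ge D/(1+\epsilon)$, the uniform case gives $\tau(\epsilon) = \epsilon / (2(1+\epsilon))$, since $\xi_{1-\delta} - \xi_\delta = 1 - 2\delta$.

```python
import math


def range_cdf_uniform(n, r):
    """P(D_n <= r) for the range D_n of n i.i.d. Uniform[0, 1] samples."""
    return n * r ** (n - 1) - (n - 1) * r ** n


def tau_uniform(eps):
    """Tail function of Uniform[0, 1] under the reading of (13) described above."""
    return eps / (2.0 * (1.0 + eps))


def range_bound(n, tau):
    """Right-hand side of Corollary 4.9 i): 2 * exp(-n * tau / 2)."""
    return 2.0 * math.exp(-n * tau / 2.0)


# Exact probability versus the bound on a small grid of (n, eps).
checks = [
    (n, eps, range_cdf_uniform(n, 1.0 / (1.0 + eps)), range_bound(n, tau_uniform(eps)))
    for n in (2, 50, 200, 1000)
    for eps in (0.05, 0.1, 0.5)
]
```

Each tuple in `checks` holds the exact probability followed by the bound. The bound is valid but loose for the uniform law, as expected from an inequality that uses only the tail mass.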



4.3 Estimation in Hypothesis Tests 



Lemma 4.2 and the discussion thereafter show that when $f_{\mathcal{H}}(D) + f_{\mathcal{K}}(D) \le 0$ (thus determining a relationship between the performance thresholds $a, a'$ and the diameter $D$) the test of Lemma 4.2 of $\mathcal{H}$ against $\mathcal{K}$ has type I error not greater than $\delta_1$ and type II error not greater than $\delta_2$. However, when good upper bounds on $D$ are not known and thus it is not known if $f_{\mathcal{H}}(D) + f_{\mathcal{K}}(D) \le 0$, these results may be of limited value. To resolve this situation we use sample data to estimate $D$ and use the estimate to test the condition $f_{\mathcal{H}}(D) + f_{\mathcal{K}}(D) \le 0$. Developing validation and certification tests along the lines above will involve sequential tests. We call the type of test we consider a stop option hypothesis test:

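Definition 4.10 For $i = 1, 2$ consider a null hypothesis $\mathcal{H}_i$ and alternative $\mathcal{K}_i$ of sets of real random variables, and a test $T_i$ of $\mathcal{H}_i$ against $\mathcal{K}_i$. Define the reduced hypothesis spaces

$$\mathcal{H}_{2e} := (\mathcal{K}_1 \cup \mathcal{H}_1) \cap \mathcal{H}_2,$$

$$\mathcal{K}_{2e} := (\mathcal{K}_1 \cup \mathcal{H}_1) \cap \mathcal{K}_2.$$

We define the stop option test $T_1 \triangleleft T_2$, which first implements $T_1$ and, if the outcome is acceptance, uses $T_2$ to test $\mathcal{H}_{2e}$ against $\mathcal{K}_{2e}$:

$$T_1 \triangleleft T_2 := \begin{cases} 0, & T_1 = 0 \\ (1, 0), & T_1 = 1,\ T_2 = 0 \\ (1, 1), & T_1 = 1,\ T_2 = 1. \end{cases}$$

All types of errors for the test $T_1 \triangleleft T_2$ can be controlled by the following three types of errors:

$$\theta_1(T_1 \triangleleft T_2) := \mathbb{P}(T_1 = 0 \mid \mathcal{H}_1)$$

$$\theta_{11}(T_1 \triangleleft T_2) := \mathbb{P}\bigl(\{T_1 = 1, T_2 = 0\} \mid \mathcal{K}_1 \cup (\mathcal{H}_1 \cap \mathcal{H}_2)\bigr)$$

$$\theta_{12}(T_1 \triangleleft T_2) := \mathbb{P}\bigl(\{T_1 = 1, T_2 = 1\} \mid \mathcal{K}_1 \cup (\mathcal{H}_1 \cap \mathcal{K}_2)\bigr)$$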

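Since

$$\mathcal{K}_1 \cup (\mathcal{H}_1 \cap \mathcal{H}_2) = (\mathcal{K}_1 \cup \mathcal{H}_1) \cap \mathcal{H}_2 = \mathcal{H}_{2e},$$

$$\mathcal{K}_1 \cup (\mathcal{H}_1 \cap \mathcal{K}_2) = (\mathcal{K}_1 \cup \mathcal{H}_1) \cap \mathcal{K}_2 = \mathcal{K}_{2e},$$

it follows that

$$\theta_{11}(T_1 \triangleleft T_2) = \mathbb{P}\bigl(\{T_1 = 1, T_2 = 0\} \mid \mathcal{H}_{2e}\bigr),$$

$$\theta_{12}(T_1 \triangleleft T_2) = \mathbb{P}\bigl(\{T_1 = 1, T_2 = 1\} \mid \mathcal{K}_{2e}\bigr).$$

Consequently, the stop option test $T_1 \triangleleft T_2$ converts tests of $\mathcal{H}_i$ against $\mathcal{K}_i$, $i = 1, 2$, into a test of $\mathcal{H}_1$ against $\mathcal{K}_1$ which, if accepted, then tests $\mathcal{H}_{2e}$ against $\mathcal{K}_{2e}$. Since

$$\mathbb{P}(T_1 = 1 \mid \mathcal{K}_1) = \mathbb{P}(\{T_1 = 1, T_2 = 0\} \mid \mathcal{K}_1) + \mathbb{P}(\{T_1 = 1, T_2 = 1\} \mid \mathcal{K}_1)$$

$$\le \mathbb{P}\bigl(\{T_1 = 1, T_2 = 0\} \mid \mathcal{K}_1 \cup (\mathcal{H}_1 \cap \mathcal{H}_2)\bigr) + \mathbb{P}\bigl(\{T_1 = 1, T_2 = 1\} \mid \mathcal{K}_1 \cup (\mathcal{H}_1 \cap \mathcal{K}_2)\bigr)$$

$$= \theta_{11} + \theta_{12},$$

it follows that if all the errors $\theta_1, \theta_{11}, \theta_{12}$ are small, then given $\mathcal{K}_1$ with high probability we reject $\mathcal{H}_1$ and stop, and given $\mathcal{H}_1$ with high probability we accept $\mathcal{H}_1$ and test well on the second test $T_2$ when applied to the reduced null hypothesis $\mathcal{H}_{2e}$ against the reduced alternative $\mathcal{K}_{2e}$. Finally, we note that we can also define the errors to be conditional errors as in the conditional hypothesis testing framework analyzed in [35]. In the applications of this paper, one can show that given $\mathcal{H}_1$ the conditional errors are roughly the same as above, and given $\mathcal{K}_1$, the conditional errors are not good. However, in this case with high probability the first test will reject and stop.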
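The control flow of a stop option test is small enough to state as code. The following sketch treats the two tests as arbitrary callables returning 0 or 1; the function name `stop_option` and the toy tests are illustrative and not part of the paper.

```python
def stop_option(test1, test2, data):
    """Run test1 first; on rejection stop with 0, otherwise also report test2."""
    if test1(data) == 0:
        return 0  # stop: the second test is never evaluated
    return (1, test2(data))


calls = []


def range_test(samples):
    """Toy first test: accept when the empirical range is at most 1."""
    return 1 if max(samples) - min(samples) <= 1.0 else 0


def mean_test(samples):
    """Toy second test: accept when the sample mean is positive."""
    calls.append(len(samples))  # record that the second test actually ran
    return 1 if sum(samples) / len(samples) > 0.0 else 0


stopped = stop_option(range_test, mean_test, [0.0, 5.0])
rejected = stop_option(range_test, mean_test, [-0.6, -0.1])
accepted = stop_option(range_test, mean_test, [0.2, 0.9])
```

The three outcomes 0, (1, 0), and (1, 1) correspond to the three cases in Definition 4.10. Returning early on rejection makes explicit that the second test is applied only under the reduced hypotheses.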

We now proceed to implement the stop option test in the validation and certification setting. To simplify the analysis in the following theorem, instead of first testing $\mathcal{H}_1 = \{f_{\mathcal{H}}(D) + f_{\mathcal{K}}(D) \le 0\}$ against $\mathcal{K}_1 = \{f_{\mathcal{H}}(D) + f_{\mathcal{K}}(D) > 0\}$ (where $D$ is the essential diameter vector) we test $\mathcal{H}_1 = \{f_{\mathcal{H}}((1+\epsilon)D) + f_{\mathcal{K}}((1+\epsilon)D) \le 0\}$ against $\mathcal{K}_1 = \{f_{\mathcal{H}}(D) + f_{\mathcal{K}}(D) > 0\}$. Also observe that this result is stated in terms of auxiliary variables which are sampled concomitantly with the sampling of the primary variable $F$. More general situations can be easily addressed.

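Theorem 4.11 Let $\mathcal{H}_2$ and $\mathcal{K}_2$ denote null and alternative hypothesis spaces of real random variables. Let $X$ be a random variable with range $\mathcal{X}$ and probability law $\mathbb{P}$, and let $F : \mathcal{X} \to \mathbb{R}$ and $F' : \mathcal{X}^n \to \mathbb{R}$. Consider non-observable i.i.d. samples $X_i, i = 1,..,n$ and observable $F'(X_1,..,X_n)$. In addition, let $k$ be a positive integer and let $F^j : \mathcal{X} \to \mathbb{R}, j = 1,..,k$ be a collection of auxiliary observables with essential diameters $D_{F^j}, j = 1,..,k$. Let $D := (D_{F^j})_{j=1,..,k}$ denote the corresponding vector of essential diameters, let

$$\hat{D}_{F^j} := \sup_{i=1,..,n} F^j(X_i) - \inf_{i=1,..,n} F^j(X_i)$$

denote the empirical diameters, and $\hat{D} := (\hat{D}_{F^j})_{j=1,..,k}$ the vector of empirical diameters. Let $f_{\mathcal{H}} : \mathbb{R}^k \to \mathbb{R}$ and $f_{\mathcal{K}} : \mathbb{R}^k \to \mathbb{R}$ be non-decreasing functions such that

$$\mathbb{P}^n\bigl(F' \le -f_{\mathcal{H}}(D) \mid \mathcal{H}_2\bigr) \le \delta_1,$$

$$\mathbb{P}^n\bigl(F' > f_{\mathcal{K}}(D) \mid \mathcal{K}_2\bigr) \le \delta_2.$$

Let $\epsilon > 0$ and define

$$\mathcal{H}_1 = \{f_{\mathcal{H}}((1+\epsilon)D) + f_{\mathcal{K}}((1+\epsilon)D) \le 0\},$$

$$\mathcal{K}_1 = \{f_{\mathcal{H}}(D) + f_{\mathcal{K}}(D) > 0\},$$

and define the test $T_1$ of $\mathcal{H}_1$ against the alternative $\mathcal{K}_1$ by

$$T_1 := \begin{cases} 1, & f_{\mathcal{H}}((1+\epsilon)\hat{D}) + f_{\mathcal{K}}((1+\epsilon)\hat{D}) \le 0 \\ 0, & f_{\mathcal{H}}((1+\epsilon)\hat{D}) + f_{\mathcal{K}}((1+\epsilon)\hat{D}) > 0. \end{cases} \tag{17}$$

Moreover, define the test $T_2$ of $\mathcal{H}_2$ against $\mathcal{K}_2$ by

$$T_2 := \begin{cases} 1, & F'(X_1,..,X_n) > -f_{\mathcal{H}}((1+\epsilon)\hat{D}) \\ 0, & F'(X_1,..,X_n) \le -f_{\mathcal{H}}((1+\epsilon)\hat{D}). \end{cases} \tag{18}$$

Finally, consider the stop option test $T_1 \triangleleft T_2$ and its associated errors $\theta_1, \theta_{11}$, and $\theta_{12}$, defined in Definition 4.10. Then if we define

$$\Delta := \sum_{j=1}^k \mathbb{P}^n\bigl((1+\epsilon)\hat{D}_{F^j} < D_{F^j}\bigr),$$

we have

$$\theta_1 = 0$$

$$\theta_{11} \le \delta_1 + \Delta$$

$$\theta_{12} \le \delta_2 + \Delta.$$

Moreover, for $j = 1,..,k$, let $\tau^j$ denote the tail functions of $F^j$ defined in (13), and let

$$n_\epsilon^j(\delta) := \frac{2\log\frac{2k}{\delta}}{\tau^j(\epsilon)}. \tag{19}$$

Then if $n \ge \max\bigl(n_\epsilon^j(\delta_1), n_\epsilon^j(\delta_2)\bigr), j = 1,..,k$ we have

$$\theta_1 = 0$$

$$\theta_{11} \le 2\delta_1$$

$$\theta_{12} \le 2\delta_2.$$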

The constraint $n \ge n_\epsilon(\delta)$ is logarithmic in $\delta^{-1}$ with multiplier $2/\tau(\epsilon)$. If $\tau(\epsilon)$ is not too small then this is a weak constraint. For fixed sample sizes, this relation can be used to determine a lower bound on the size of $\epsilon$ which can be used.
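To get a feel for the size of this constraint, the sketch below computes the smallest $n$ for which each of the $k$ terms $2e^{-n\tau/2}$ from Corollary 4.9 is at most $\delta/k$, that is $n \ge \frac{2}{\tau}\log\frac{2k}{\delta}$. This is the requirement used in the proof of Lemma 6.3. Reading (19) as exactly this expression is an assumption, and the function name is hypothetical.

```python
import math


def min_sample_size(delta, tau, k=1):
    """Smallest integer n with 2 * exp(-n * tau / 2) <= delta / k."""
    return math.ceil(2.0 * math.log(2.0 * k / delta) / tau)


# Example: a uniform-like tail with eps = 0.1, so tau = eps / (2 * (1 + eps)).
tau = 0.1 / (2.0 * 1.1)
n_needed = min_sample_size(0.05, tau)
```

With these numbers the constraint is met from $n = 163$ onward. Halving $\delta$ adds only about $2\log 2 / \tau \approx 31$ samples, which illustrates the logarithmic dependence on $\delta^{-1}$.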



5 Validation and Certification with Estimated Diameters 

We now present tests for validation and certification using estimated diameters. They show that if the coefficients of separability $c$ are approximately known then the validation and certification Corollaries 2.3 and 2.4, using the estimated diameters, are still good.

Corollary 5.1 (Validation with Estimated Diameters) Let $a' < a$, $0 < p < 1$ and let

$$\mathcal{H}_2 = \{\mathbb{P}(F \ge a) > p\}$$

$$\mathcal{K}_2 = \{\mathbb{P}(F \ge a') < p\}.$$

With the assumptions of Theorem 4.11 let

$$F'(X_1,..,X_n) := \langle F \rangle_n$$

be the sample mean. Let $c \ge c_F$ be a known constant and consider $F : \mathcal{X} \to \mathbb{R}$ with essential diameter $D$ as the only auxiliary observable. Let $\tau$ be the tail function of $F$ defined in (13). Let $\epsilon > 0$ and let $n_\epsilon(\delta)$ be defined in (19) with $k = 1$. Let $0 < \delta_1, \delta_2 < 1$ and define

$$f_{\mathcal{H}}(r) := \frac{cr}{\sqrt{2}}\sqrt{\log p^{-1}} + \frac{cr}{\sqrt{2n}}\sqrt{\log \delta_1^{-1}} - a,$$

$$f_{\mathcal{K}}(r) := \frac{cr}{\sqrt{2}}\sqrt{\log (1-p)^{-1}} + \frac{cr}{\sqrt{2n}}\sqrt{\log \delta_2^{-1}} + a',$$

and let

$$\hat{D} := \sup_{i=1,..,n} F(X_i) - \inf_{i=1,..,n} F(X_i)$$

denote the empirical diameter. Moreover, consider the stop option test $T_1 \triangleleft T_2$ and its associated errors $\theta_1, \theta_{11}$, and $\theta_{12}$, as in Theorem 4.11. Then for $n \ge \max\bigl(n_\epsilon(\delta_1), n_\epsilon(\delta_2)\bigr)$ we have

$$\theta_1 = 0$$

$$\theta_{11} \le 2\delta_1$$

$$\theta_{12} \le 2\delta_2.$$
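The validation procedure of Corollary 5.1 can be sketched end to end. The code below assumes the margin functions have the form read off from the proof of Corollary 5.1, namely $f_{\mathcal{H}}(r) = \frac{cr}{\sqrt{2}}\sqrt{\log p^{-1}} + \frac{cr}{\sqrt{2n}}\sqrt{\log\delta_1^{-1}} - a$ and the analogous $f_{\mathcal{K}}$ with $1-p$, $\delta_2$, and $+a'$. All names are illustrative, and the check $n \ge n_\epsilon(\delta)$ is left to the caller because it needs the tail function.

```python
import math


def validation_stop_option(samples, a, a_prime, p, c, eps, delta1, delta2):
    """Stop option test from observed values F(X_1), .., F(X_n).

    Returns 0 if the first test rejects, otherwise (1, T2) with T2 in {0, 1}.
    """
    n = len(samples)
    d_hat = max(samples) - min(samples)  # empirical diameter
    r = (1.0 + eps) * d_hat              # inflated diameter estimate

    def margin(t, scale):
        # scale * c * r * sqrt(log(1/t) / 2), the McDiarmid-type margin
        return scale * c * r * math.sqrt(math.log(1.0 / t) / 2.0)

    f_h = margin(p, 1.0) + margin(delta1, 1.0 / math.sqrt(n)) - a
    f_k = margin(1.0 - p, 1.0) + margin(delta2, 1.0 / math.sqrt(n)) + a_prime
    if f_h + f_k > 0.0:
        return 0  # the thresholds a, a' are too close for the estimated diameter
    mean = sum(samples) / n
    return (1, 1 if mean > -f_h else 0)


# Deterministic toy data: 1000 values spread evenly over [0.9, 1.1].
values = [0.9 + 0.2 * i / 999 for i in range(1000)]
common = dict(p=0.5, c=1.0, eps=0.1, delta1=0.05, delta2=0.05)
validated = validation_stop_option(values, a=1.0, a_prime=0.7, **common)
not_validated = validation_stop_option(values, a=1.2, a_prime=0.9, **common)
undecidable = validation_stop_option(values, a=1.0, a_prime=0.9, **common)
```

In the first call the gap $a - a'$ exceeds the combined margins and the sample mean clears $a$ minus the null margin, so the outcome is (1, 1). In the second call the first test passes but the mean does not clear the threshold, giving (1, 0). In the third call the thresholds are too close and the procedure stops with 0.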



For certification, we now address the case where $F = F_1 + F_2$, where $F_1$ can represent the model and $F_2$ represent the difference between the physical system and the model.

Corollary 5.2 (Certification with Estimated Diameters) Let $a' < a$, $0 < p < 1$ and let

$$\mathcal{H}_2 = \{\mathbb{P}(F \ge a) > p\}$$

$$\mathcal{K}_2 = \{\mathbb{P}(F \ge a') < p\}.$$

With the assumptions of Theorem 4.11 let $F = F_1 + F_2$. Let $n = n_1 + n_2$ and in terms of the observations $F_1(X_i), i = 1,..,n_1$ and $F_2(X_i), i = n_1 + 1,..,n_1 + n_2$ define

$$F'(X_1,..,X_n) := \langle F_1 \rangle_{n_1} + \langle F_2 \rangle_{n_2}.$$

Let $c_1 \ge c_{F_1}$ and $c_2 \ge c_{F_2}$ be known constants and consider $F_1$ with essential diameter $D_1$ and $F_2$ with essential diameter $D_2$ as auxiliary observables, with diameter vector $D := (D_1, D_2)$. Let $\tau^j, j = 1, 2$ be the tail functions, defined in (13), of $F_1$ and $F_2$ respectively. Let $\epsilon > 0$ and let $n_\epsilon^j(\delta), j = 1, 2$ be defined in (19) with $k = 2$. Let $0 < \delta_1, \delta_2 < 1$ and define

$$f_{\mathcal{H}}(s_1, s_2) := \frac{c_1 s_1 + c_2 s_2}{\sqrt{2}}\sqrt{\log p^{-1}} + \sqrt{\frac{c_1^2 s_1^2}{2n_1} + \frac{c_2^2 s_2^2}{2n_2}}\ \sqrt{\log \delta_1^{-1}} - a,$$

$$f_{\mathcal{K}}(s_1, s_2) := \frac{c_1 s_1 + c_2 s_2}{\sqrt{2}}\sqrt{\log (1-p)^{-1}} + \sqrt{\frac{c_1^2 s_1^2}{2n_1} + \frac{c_2^2 s_2^2}{2n_2}}\ \sqrt{\log \delta_2^{-1}} + a',$$

and let

$$\hat{D}_1 := \sup_{i=1,..,n_1} F_1(X_i) - \inf_{i=1,..,n_1} F_1(X_i)$$

$$\hat{D}_2 := \sup_{i=n_1+1,..,n_1+n_2} F_2(X_i) - \inf_{i=n_1+1,..,n_1+n_2} F_2(X_i)$$

denote the empirical diameters with empirical diameter vector $\hat{D} := (\hat{D}_1, \hat{D}_2)$. Moreover, consider the stop option test $T_1 \triangleleft T_2$ and its associated errors $\theta_1, \theta_{11}$, and $\theta_{12}$, as in Theorem 4.11. Then for $n_j \ge \max\bigl(n_\epsilon^j(\delta_1), n_\epsilon^j(\delta_2)\bigr), j = 1, 2$ we have

$$\theta_1 = 0$$

$$\theta_{11} \le 2\delta_1$$

$$\theta_{12} \le 2\delta_2.$$

6 Proofs 



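Proof of Theorem 2.2: To begin we first prove the following simple lemma that quantifies how the mass constraints of the null $\mathcal{H}^{\mathcal{D}}$ or the alternative $\mathcal{K}^{\mathcal{D}}$ imply a constraint on the value of $\mathbb{E}U$.

Lemma 6.1 Let $0 < p < 1$, $a \in \mathbb{R}$ and $\mathcal{D} > 0$. For $0 < t < 1$ let $r_t := \frac{\mathcal{D}}{\sqrt{2}}\sqrt{\log t^{-1}}$. Suppose $U \in \mathcal{H}^{\mathcal{D}}$, then $\mathbb{E}U \ge a - r_p$. Suppose $U \in \mathcal{K}^{\mathcal{D}}$, then $\mathbb{E}U \le a + r_{1-p}$.

Proof: Let $U \in \mathcal{H}^{\mathcal{D}}$ and suppose to the contrary that $\mathbb{E}U < a - r_p$. Then we have

$$p < \mathbb{P}(U \ge a) \le \mathbb{P}(U \ge \mathbb{E}U + r_p) = \mathbb{P}(U - \mathbb{E}U \ge r_p) \le p$$

which is a contradiction, thus establishing the first assertion. Now let $U \in \mathcal{K}^{\mathcal{D}}$ and suppose to the contrary that $\mathbb{E}U > a + r_{1-p}$. Then

$$p > \mathbb{P}(U \ge a) \ge \mathbb{P}(U > \mathbb{E}U - r_{1-p}) = \mathbb{P}(U - \mathbb{E}U > -r_{1-p}) = 1 - \mathbb{P}(U - \mathbb{E}U \le -r_{1-p}) \ge p$$

which is a contradiction, thus establishing the second assertion. ■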



The confidence version of the following result essentially completes the proof of Theorem 2.2.

Lemma 6.2 With the assumptions of Theorem 2.2 let $a, a' \in \mathbb{R}$ satisfy $a - a' \ge r_p + r_{1-p}$ so that the interval $[a' + r_{1-p}, a - r_p]$ is nonempty. Let $b \in [a' + r_{1-p}, a - r_p]$ and consider the test $T$ of $\mathcal{H}^{\mathcal{D}}$ against $\mathcal{K}^{\mathcal{D}}$ defined by

$$T := \begin{cases} 1, & F'(y) \ge b \\ 0, & F'(y) < b. \end{cases}$$

Then we have

$$\theta_1(U) \le e^{-\frac{2((a - r_p) - b)^2}{\mathcal{D}'^2}}$$

$$\theta_2(U) \le e^{-\frac{2(b - (a' + r_{1-p}))^2}{\mathcal{D}'^2}}.$$

Proof: Suppose $U \in \mathcal{H}^{\mathcal{D}}$. Then by Lemma 6.1 we have

$$b \le b - (a - r_p) + \mathbb{E}U = b - (a - r_p) + \mathbb{E}F'$$

so that by Theorem 2.1 applied to $F'$, we conclude that the Type I error satisfies

$$\theta_1(U) = \mathbb{P}(F'(Y) < b) \le \mathbb{P}\bigl(F' - \mathbb{E}F' < b - (a - r_p)\bigr) \le e^{-\frac{2((a - r_p) - b)^2}{\mathcal{D}_{F'}^2}} \le e^{-\frac{2((a - r_p) - b)^2}{\mathcal{D}'^2}}.$$

Similarly, if we suppose $U \in \mathcal{K}^{\mathcal{D}}$, by Lemma 6.1 we have

$$b \ge b - (a' + r_{1-p}) + \mathbb{E}U = b - (a' + r_{1-p}) + \mathbb{E}F'.$$

Consequently, the Type II error satisfies

$$\theta_2(U) = \mathbb{P}(F'(Y) \ge b) \le \mathbb{P}\bigl(F' - \mathbb{E}F' \ge b - (a' + r_{1-p})\bigr) \le e^{-\frac{2(b - (a' + r_{1-p}))^2}{\mathcal{D}_{F'}^2}} \le e^{-\frac{2(b - (a' + r_{1-p}))^2}{\mathcal{D}'^2}}.$$

We are now ready to complete the proof of Theorem 2.2. It is easy to show that $a, a'$ and $b$ satisfy the assumptions of Lemma 6.2. Since it follows from the assumptions that $(a - r_p) - b \ge r'_\delta$ and $b - (a' + r_{1-p}) \ge r'_\delta$, the assertion follows from Lemma 6.2.



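Proof of Corollary 2.3: It is not hard to see that $\mathbb{E}F' = \mathbb{E}F = \mathbb{E}U$. Let the vector variable $X$ have $J$ components and index the $Jn$ components of $\prod_{i=1}^n \mathcal{X}_i$ by the map $(i, j) \mapsto k_{ij} = J(i-1) + j$, $j = 1,..,J$, $i = 1,..,n$. First observe that if we define $F' : \prod_{i=1}^n \mathcal{X}_i \to \mathbb{R}$ by $F'(\prod_{i=1}^n X_i) := \langle F \rangle_n$ we have $\mathbb{E}F' = \mathbb{E}F = \mathbb{E}U$. Moreover, we have $D^{F'}_{k_{ij}} \le \frac{1}{n} D^F_j$ for all $j = 1,..,J$, $i = 1,..,n$, and therefore

$$\mathcal{D}_{F'}^2 = \sum_{j=1,..,J,\ i=1,..,n} \bigl(D^{F'}_{k_{ij}}\bigr)^2 \le \frac{1}{n^2} \sum_{j=1,..,J,\ i=1,..,n} \bigl(D^F_j\bigr)^2 = \frac{1}{n} \sum_{j=1,..,J} \bigl(D^F_j\bigr)^2 = \frac{1}{n}\mathcal{D}_F^2$$

and so we conclude that $\mathcal{D}_{F'} \le \frac{\mathcal{D}_F}{\sqrt{n}} \le \frac{\mathcal{D}}{\sqrt{n}}$. The assertion follows from Theorem 2.2. ■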



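Proof of Corollary 2.4: As in the proof of Corollary 2.3 index the $J(n_1 + n_2)$ components of $\prod_{i=1}^{n_1+n_2} \mathcal{X}_i$ by the map $(i, j) \mapsto k_{ij} = J(i-1) + j$, $j = 1,..,J$, $i = 1,..,n_1 + n_2$. Since $D^F_j \le D^{F_1}_j + D^{F_2}_j$, $j = 1,..,J$, it easily follows from the triangle inequality in $\ell_2$ and the assumptions that $\mathcal{D}_F \le \mathcal{D}_{F_1} + \mathcal{D}_{F_2} \le \mathcal{D}_1 + \mathcal{D}_2 \le \mathcal{D}$. Consequently, $U \in \mathcal{U}_{\mathcal{D}}$ and we can apply Theorem 2.2. To that end observe that $F' : \prod_{i=1}^{n_1+n_2} \mathcal{X}_i \to \mathbb{R}$ defined by $F'(\prod_{i=1}^{n_1+n_2} X_i) := \langle F_1 \rangle_{n_1} + \langle F_2 \rangle_{n_2}$ satisfies $\mathbb{E}F' = \mathbb{E}F_1 + \mathbb{E}F_2 = \mathbb{E}F = \mathbb{E}U$. Moreover, it follows that $D^{F'}_{k_{ij}} \le \frac{1}{n_1} D^{F_1}_j$ if $1 \le i \le n_1$ and $D^{F'}_{k_{ij}} \le \frac{1}{n_2} D^{F_2}_j$ if $n_1 + 1 \le i \le n_1 + n_2$. Consequently we conclude that

$$\mathcal{D}_{F'}^2 \le \sum_{j=1,..,J,\ 1 \le i \le n_1} \bigl(D^{F'}_{k_{ij}}\bigr)^2 + \sum_{j=1,..,J,\ n_1+1 \le i \le n_1+n_2} \bigl(D^{F'}_{k_{ij}}\bigr)^2 \le \frac{1}{n_1^2} \sum_{j=1,..,J,\ 1 \le i \le n_1} \bigl(D^{F_1}_j\bigr)^2 + \frac{1}{n_2^2} \sum_{j=1,..,J,\ n_1+1 \le i \le n_1+n_2} \bigl(D^{F_2}_j\bigr)^2.$$

Therefore, since

$$\sum_{j=1,..,J,\ 1 \le i \le n_1} \bigl(D^{F_1}_j\bigr)^2 = n_1 \mathcal{D}_{F_1}^2 \le n_1 \mathcal{D}_1^2$$

and

$$\sum_{j=1,..,J,\ n_1+1 \le i \le n_1+n_2} \bigl(D^{F_2}_j\bigr)^2 = n_2 \mathcal{D}_{F_2}^2 \le n_2 \mathcal{D}_2^2$$

we conclude that

$$\mathcal{D}_{F'} \le \sqrt{\frac{\mathcal{D}_1^2}{n_1} + \frac{\mathcal{D}_2^2}{n_2}}.$$

Consequently Theorem 2.2 implies the first assertion. For the second observe that $\mathcal{D}_2 \le \mathcal{D}_1$ implies that $\mathcal{D}_F \le \mathcal{D}_1 + \mathcal{D}_2 \le 2\mathcal{D}_1$, which implies $U \in \mathcal{U}_{\mathcal{D}}$. Moreover, choosing $n_1$ and $n_2$ so that $\mathcal{D}_2^2 / n_2 \le \mathcal{D}_1^2 / n_1$ implies that

$$\mathcal{D}_{F'} \le \sqrt{\frac{\mathcal{D}_1^2}{n_1} + \frac{\mathcal{D}_2^2}{n_2}} \le \mathcal{D}_1 \sqrt{\frac{2}{n_1}}.$$

Theorem 2.2 then implies the assertion. ■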


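Proof of Lemma: The assumption $d(F, \hat{F}) \le \delta_p$ implies that if $\mathbb{P}(F \ge a) > p$ then $\mathbb{P}(\hat{F} \ge a) > p - \delta_p$, and if $\mathbb{P}(F \ge a') < p$ then $\mathbb{P}(\hat{F} \ge a') < p + \delta_p$. The result then follows from Corollary 2.3. ■

Proof of Lemma 4.2: The first assertion is trivial and since $g_{\mathcal{H}}(D) + g_{\mathcal{K}}(D) \le 0$ the second follows from

$$\theta_2 = \mathbb{P}\bigl(F' > -g_{\mathcal{H}}(D) \mid \mathcal{K}\bigr) \le \mathbb{P}\bigl(F' > g_{\mathcal{K}}(D) \mid \mathcal{K}\bigr) \le \delta_2.$$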



Proof of Lemma 4.4: The first assertion follows from the fact that both $F \mapsto \mathcal{D}_F$ and $F \mapsto D_F$ are diagonal bijective invariants. The second assertion follows from the fact that both $F \mapsto \mathcal{D}_F$ and $F \mapsto D_F$ are invariant under $F \mapsto F + b$, $b \in \mathbb{R}$, and both transform through scaling $F \mapsto aF$ by $\mathcal{D}_{aF} = |a|\mathcal{D}_F$ and $D_{aF} = |a| D_F$. For the third, let $x, x'$ approximately achieve the supremum in $D_F$ to accuracy $\epsilon$. That is, $F(x) - F(x') \ge D_F - \epsilon$. Then using the product nature of $\mathcal{X}$ we find that

$$F(x) - F(x') = F(x_1,..,x_m) - F(x_1',..,x_m')$$

$$= F(x_1, x_2,..,x_m) - F(x_1', x_2,..,x_m) + F(x_1', x_2,..,x_m) - F(x_1', x_2',..,x_m) + \cdots$$

$$\le \sum_{j=1}^m D^F_j$$

and so conclude that $D_F \le \sum_{j=1}^m D^F_j + \epsilon$. Since $\epsilon$ is arbitrary we then conclude that

$$D_F \le \sum_{j=1}^m D^F_j \le \sqrt{m}\sqrt{\sum_{j=1}^m \bigl(D^F_j\bigr)^2} = \sqrt{m}\,\mathcal{D}_F$$

from which we conclude that

$$c_F = \frac{\mathcal{D}_F}{D_F} \ge \frac{1}{\sqrt{m}}.$$

On the other hand, since $D_F \ge D^F_j$, $j = 1,..,m$ we obtain

$$m D_F^2 \ge \sum_{j=1}^m \bigl(D^F_j\bigr)^2 = \mathcal{D}_F^2$$

and conclude that $c_F = \frac{\mathcal{D}_F}{D_F} \le \sqrt{m}$. ■

Proof of Theorem 4.5: Consider the first set of assertions. According to the proof of [31, Thm. 2.3.2] we have

$$\mathbb{P}^n(\hat{\xi}_q > \xi) = \mathbb{P}^n\Bigl(\sum_{i=1}^n V_i - \sum_{i=1}^n \mathbb{E}V_i > n\delta_1\Bigr)$$

where $\delta_1 := \mathbb{F}(\xi) - q$ and $V_i := I\{X_i > \xi\}$. Consequently we obtain $\mathbb{E}V_i = 1 - \mathbb{F}(\xi), i = 1,..,n$. Applying Hoeffding's inequality [26, Eqn. 2.4] establishes the first assertion. The second assertion follows from [26, Thm. 2.3b], which he attributes to [36] in the binomial case. The last assertion follows from [26, Thm. 2.3c] by the change of variables $V_i' := 1 - V_i$, which he also attributes to [36] in the binomial case.

For the second set of assertions, observe that according to the proof of [31, Thm. 2.3.2] we have

$$\mathbb{P}^n(\hat{\xi}_q \le \xi) = \mathbb{P}^n\Bigl(\sum_{i=1}^n W_i - \sum_{i=1}^n \mathbb{E}W_i \ge n\delta_2\Bigr)$$

where $\delta_2 := q - \mathbb{F}(\xi)$ and $W_i := I\{X_i \le \xi\}$. Consequently we obtain $\mathbb{E}W_i = \mathbb{F}(\xi), i = 1,..,n$. The assertions then follow in the same way as in the first set but with the role of $\mathbb{F}$ switched with $1 - \mathbb{F}$. ■



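Proof of Theorem 4.6: Since the first assertion is clearly true when $\xi_p - \xi_{1-p} \le 0$ we can assume $\xi_p - \xi_{1-p} > 0$ and $\frac{1}{2} < p < 1$. First observe that for $c, a \in \mathbb{R}$ we have

$$\mathbb{P}^n\bigl(D_n \le c(\xi_p - \xi_{1-p})\bigr) \le \mathbb{P}^n\Bigl(\sup_{i=1,..,n} X_i \le c\,\xi_p + a\Bigr) + \mathbb{P}^n\Bigl(\inf_{i=1,..,n} X_i > c\,\xi_{1-p} + a\Bigr). \tag{20}$$

We will address each term on the right-hand side separately using Theorem 4.5. For $\epsilon > 0$ let $p' > 1 - \frac{1}{n}$ so that we have the identities $\hat{\xi}_{p'} = \sup_{i=1,..,n} X_i$ and $\hat{\xi}_{1-p'} = \inf_{i=1,..,n} X_i$. Since $p' > p$ it follows that $\xi_p \le \xi_{p'}$ and consequently $\xi_p - \epsilon < \xi_{p'}$. Moreover, a similar argument shows that $\xi_{1-p} + \epsilon > \xi_{1-p'}$. Consequently, if we define

$$c_\epsilon := 1 - \frac{2\epsilon}{\xi_p - \xi_{1-p}}$$

and

$$a_\epsilon := \frac{\xi_p(\xi_{1-p} + \epsilon) - \xi_{1-p}(\xi_p - \epsilon)}{\xi_p - \xi_{1-p}}$$

it follows that $c_\epsilon < 1$ and

$$c_\epsilon \xi_p + a_\epsilon = \xi_p - \epsilon < \xi_{p'},$$

$$c_\epsilon \xi_{1-p} + a_\epsilon = \xi_{1-p} + \epsilon > \xi_{1-p'}.$$

Consequently we can apply Part 2iii of Theorem 4.5 to the first term on the right-hand side of (20) to obtain

$$\mathbb{P}^n\Bigl(\sup_{i=1,..,n} X_i \le c_\epsilon \xi_p + a_\epsilon\Bigr) = \mathbb{P}^n\Bigl(\sup_{i=1,..,n} X_i \le \xi_p - \epsilon\Bigr) = \mathbb{P}^n\bigl(\hat{\xi}_{p'} \le \xi_p - \epsilon\bigr) \le e^{-\frac{n\delta_2^2}{2(1 - \mathbb{F}(\xi_p - \epsilon))}}$$

where $\delta_2 := p' - \mathbb{F}(\xi_p - \epsilon)$. Letting $p' \to 1$ we obtain

$$\mathbb{P}^n\Bigl(\sup_{i=1,..,n} X_i \le c_\epsilon \xi_p + a_\epsilon\Bigr) \le e^{-\frac{n}{2}(1 - \mathbb{F}(\xi_p - \epsilon))}.$$

Since $\mathbb{F}(\xi_p - \epsilon) \le \mathbb{F}(\xi_p -) \le p$ we then conclude that

$$\mathbb{P}^n\Bigl(\sup_{i=1,..,n} X_i \le c_\epsilon \xi_p + a_\epsilon\Bigr) \le e^{-\frac{n}{2}(1-p)}. \tag{21}$$

For the second term on the right of (20) we apply Part 1iii of Theorem 4.5 to obtain

$$\mathbb{P}^n\Bigl(\inf_{i=1,..,n} X_i > c_\epsilon \xi_{1-p} + a_\epsilon\Bigr) = \mathbb{P}^n\Bigl(\inf_{i=1,..,n} X_i > \xi_{1-p} + \epsilon\Bigr) = \mathbb{P}^n\bigl(\hat{\xi}_{1-p'} > \xi_{1-p} + \epsilon\bigr) \le e^{-\frac{n\delta_1^2}{2\mathbb{F}(\xi_{1-p} + \epsilon)}}$$

where $\delta_1 := \mathbb{F}(\xi_{1-p} + \epsilon) - 1 + p'$. Letting $p' \to 1$ and using $\mathbb{F}(\xi_{1-p} + \epsilon) \ge \mathbb{F}(\xi_{1-p}) \ge 1 - p$ we obtain

$$\mathbb{P}^n\Bigl(\inf_{i=1,..,n} X_i > c_\epsilon \xi_{1-p} + a_\epsilon\Bigr) \le e^{-\frac{n}{2}(1-p)}. \tag{22}$$

We combine the inequalities (21) and (22) with (20) to obtain

$$\mathbb{P}^n\bigl(D_n \le c_\epsilon(\xi_p - \xi_{1-p})\bigr) \le 2e^{-\frac{n}{2}(1-p)}.$$

Since $c_\epsilon \uparrow 1$ as $\epsilon \downarrow 0$ the first assertion of the theorem follows (see e.g. [37, Thm. 1.2.7]).

For the second assertion, observe that it is clearly true when $\xi_p = 0$ so we can assume $\xi_p > 0$. Now observe that the proof of Equation (21) actually proved that

$$\mathbb{P}^n\Bigl(\sup_{i=1,..,n} X_i \le \xi_p - \epsilon\Bigr) \le e^{-\frac{n}{2}(1-p)}.$$

Since $(\xi_p - \epsilon) \uparrow \xi_p$ as $\epsilon \downarrow 0$ the second assertion follows. ■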



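Proof of Proposition 4.8: Let $\delta_+ := 1 - \mathbb{F}\bigl(X_+ - \frac{\epsilon}{2(1+\epsilon)}D\bigr)$ and $\delta_- := \mathbb{F}\bigl(X_- + \frac{\epsilon}{2(1+\epsilon)}D\bigr)$ and define $\delta^* = \min(\delta_-, \delta_+) - \delta'$ with $\delta' > 0$. Then since Lemma 7.1 asserts that $\mathbb{F}^{-1}(t) \le x$ if and only if $t \le \mathbb{F}(x)$ it follows that $\mathbb{F}^{-1}\bigl(\mathbb{F}(X_+ - \frac{\epsilon}{2(1+\epsilon)}D) + \delta'\bigr) > X_+ - \frac{\epsilon}{2(1+\epsilon)}D$ and therefore

$$\xi_{1-\delta^*} \ge \xi_{1 - \delta_+ + \delta'} = \mathbb{F}^{-1}(1 - \delta_+ + \delta') = \mathbb{F}^{-1}\Bigl(\mathbb{F}\bigl(X_+ - \tfrac{\epsilon}{2(1+\epsilon)}D\bigr) + \delta'\Bigr) > X_+ - \frac{\epsilon}{2(1+\epsilon)}D.$$

Moreover, since

$$\xi_{\delta^*} \le \xi_{\delta_-} = \mathbb{F}^{-1}(\delta_-) = \mathbb{F}^{-1}\Bigl(\mathbb{F}\bigl(X_- + \tfrac{\epsilon}{2(1+\epsilon)}D\bigr)\Bigr) \le X_- + \frac{\epsilon}{2(1+\epsilon)}D$$

we conclude that

$$\xi_{1-\delta^*} - \xi_{\delta^*} > D - \frac{\epsilon}{1+\epsilon}D = \frac{D}{1+\epsilon}.$$

The assertion then follows by letting $\delta' \to 0$. ■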



Proof of Corollary 4.9: We will only prove the first assertion since the proof of the second is essentially the same. Since the assertion is trivially true when $D = 0$ we can assume this is not the case. Then since for $0 < p < \tau(\epsilon)$ we have

$$\xi_{1-p} - \xi_p \ge \frac{D}{1+\epsilon}$$

the assertion follows from Theorem 4.6. ■

Proof of Theorem 4.11: Since $\hat{D} \le D$ with probability 1 we find that

$$\theta_1 = \mathbb{P}^n(T_1 = 0 \mid \mathcal{H}_1)$$

$$= \mathbb{P}^n\bigl(f_{\mathcal{H}}((1+\epsilon)\hat{D}) + f_{\mathcal{K}}((1+\epsilon)\hat{D}) > 0 \mid f_{\mathcal{H}}((1+\epsilon)D) + f_{\mathcal{K}}((1+\epsilon)D) \le 0\bigr)$$

$$\le \mathbb{P}^n\bigl(f_{\mathcal{H}}((1+\epsilon)D) + f_{\mathcal{K}}((1+\epsilon)D) > 0 \mid f_{\mathcal{H}}((1+\epsilon)D) + f_{\mathcal{K}}((1+\epsilon)D) \le 0\bigr)$$

$$= 0$$

establishing the first assertion. For the second and third assertions we use the following lemma.

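Lemma 6.3 With the assumptions of Theorem 4.11 let $F' : \mathcal{X}^n \to \mathbb{R}$ and let $f : \mathbb{R}^k \to \mathbb{R}$ be non-decreasing. Then we have

$$\mathbb{P}^n\bigl(F' > f((1+\epsilon)\hat{D})\bigr) \le \mathbb{P}^n\bigl(F' > f(D)\bigr) + \sum_{j=1}^k \mathbb{P}^n\bigl((1+\epsilon)\hat{D}_{F^j} < D_{F^j}\bigr). \tag{23}$$

Moreover, for $n \ge n_\epsilon^j(\delta), j = 1,..,k$ we have

$$\sum_{j=1}^k \mathbb{P}^n\bigl((1+\epsilon)\hat{D}_{F^j} < D_{F^j}\bigr) \le \delta$$

and therefore

$$\mathbb{P}^n\bigl(F' > f((1+\epsilon)\hat{D})\bigr) \le \mathbb{P}^n\bigl(F' > f(D)\bigr) + \delta.$$

Proof: By the monotonicity of $f$ it follows that

$$\{F' - f((1+\epsilon)\hat{D}) > 0\} \subset \{F' - f(D) > 0\} \cup \bigcup_{j=1}^k \{(1+\epsilon)\hat{D}_{F^j} < D_{F^j}\}.$$

Consequently, we obtain the first assertion:

$$\mathbb{P}^n\bigl(F' > f((1+\epsilon)\hat{D})\bigr) \le \mathbb{P}^n\bigl(F' > f(D)\bigr) + \mathbb{P}^n\Bigl(\bigcup_{j=1}^k \{(1+\epsilon)\hat{D}_{F^j} < D_{F^j}\}\Bigr)$$

$$\le \mathbb{P}^n\bigl(F' > f(D)\bigr) + \sum_{j=1}^k \mathbb{P}^n\bigl((1+\epsilon)\hat{D}_{F^j} < D_{F^j}\bigr).$$

For the second, observe that by Corollary 4.9 we find that for each $j$ we have

$$\mathbb{P}^n\bigl((1+\epsilon)\hat{D}_{F^j} < D_{F^j}\bigr) \le 2e^{-\frac{n\tau^j(\epsilon)}{2}}.$$

Since the assumption $n \ge n_\epsilon^j(\delta)$, defined in (19), implies that

$$2e^{-\frac{n\tau^j(\epsilon)}{2}} \le \frac{\delta}{k}, \quad j = 1,..,k,$$

the second assertion follows from the first. ■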



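We now proceed to the second and third assertions of Theorem 4.11. Observe that

$$\theta_{11} = \mathbb{P}^n(T_1 = 1, T_2 = 0 \mid \mathcal{H}_{2e})$$

$$\le \mathbb{P}^n(T_2 = 0 \mid \mathcal{H}_2)$$

$$= \mathbb{P}^n\bigl(F' \le -f_{\mathcal{H}}((1+\epsilon)\hat{D}) \mid \mathcal{H}_2\bigr).$$

Since $\Delta = \sum_{j=1}^k \mathbb{P}^n\bigl((1+\epsilon)\hat{D}_{F^j} < D_{F^j}\bigr)$, Lemma 6.3 (Equation 23) applied to $-F'$ then shows that

$$\theta_{11} \le \mathbb{P}^n\bigl(F' \le -f_{\mathcal{H}}((1+\epsilon)\hat{D}) \mid \mathcal{H}_2\bigr) \le \mathbb{P}^n\bigl(F' \le -f_{\mathcal{H}}(D) \mid \mathcal{H}_2\bigr) + \Delta$$

thus establishing the second assertion. Since $T_1 = 1$ implies that $f_{\mathcal{H}}((1+\epsilon)\hat{D}) + f_{\mathcal{K}}((1+\epsilon)\hat{D}) \le 0$ we find that

$$\theta_{12} = \mathbb{P}^n(T_1 = 1, T_2 = 1 \mid \mathcal{K}_{2e})$$

$$= \mathbb{P}^n\bigl(T_1 = 1, F' > -f_{\mathcal{H}}((1+\epsilon)\hat{D}) \mid \mathcal{K}_{2e}\bigr)$$

$$\le \mathbb{P}^n\bigl(T_1 = 1, F' > f_{\mathcal{K}}((1+\epsilon)\hat{D}) \mid \mathcal{K}_{2e}\bigr)$$

$$\le \mathbb{P}^n\bigl(F' > f_{\mathcal{K}}((1+\epsilon)\hat{D}) \mid \mathcal{K}_2\bigr).$$

As in the previous case, Lemma 6.3 then shows that

$$\theta_{12} \le \mathbb{P}^n\bigl(F' > f_{\mathcal{K}}((1+\epsilon)\hat{D}) \mid \mathcal{K}_2\bigr) \le \mathbb{P}^n\bigl(F' > f_{\mathcal{K}}(D) \mid \mathcal{K}_2\bigr) + \Delta$$

thus establishing the third assertion.

The last set of assertions follows by observing that the assumption $n \ge \max\bigl(n_\epsilon^j(\delta_1), n_\epsilon^j(\delta_2)\bigr), j = 1,..,k$ and Lemma 6.3 imply that $\Delta \le \min(\delta_1, \delta_2)$. ■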



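Proof of Corollary 5.1: Since $\mathbb{E}F' = \mathbb{E}F$, Lemma 4.1 implies that

$$\mathbb{P}^n\bigl(F' \le -f'_{\mathcal{H}}(\mathcal{D}_F, \mathcal{D}_{F'}, \delta_1) \mid \mathcal{H}_2\bigr) \le \delta_1,$$

$$\mathbb{P}^n\bigl(F' > f'_{\mathcal{K}}(\mathcal{D}_F, \mathcal{D}_{F'}, \delta_2) \mid \mathcal{K}_2\bigr) \le \delta_2,$$

where

$$f'_{\mathcal{H}}(r_1, r_2, \delta) := \frac{r_1}{\sqrt{2}}\sqrt{\log p^{-1}} + \frac{r_2}{\sqrt{2}}\sqrt{\log \delta^{-1}} - a,$$

$$f'_{\mathcal{K}}(r_1, r_2, \delta) := \frac{r_1}{\sqrt{2}}\sqrt{\log (1-p)^{-1}} + \frac{r_2}{\sqrt{2}}\sqrt{\log \delta^{-1}} + a'.$$

By Definition 4.3 we have $\mathcal{D}_F \le cD$. Moreover, the proof of Corollary 2.3 shows that

$$\mathcal{D}_{F'} \le \frac{1}{\sqrt{n}}\mathcal{D}_F \le \frac{cD}{\sqrt{n}}.$$

Consequently, we have $f'_{\mathcal{H}}(\mathcal{D}_F, \mathcal{D}_{F'}, \delta_1) \le f_{\mathcal{H}}(D)$ and $f'_{\mathcal{K}}(\mathcal{D}_F, \mathcal{D}_{F'}, \delta_2) \le f_{\mathcal{K}}(D)$ and therefore

$$\mathbb{P}^n\bigl(F' \le -f_{\mathcal{H}}(D) \mid \mathcal{H}_2\bigr) \le \delta_1,$$

$$\mathbb{P}^n\bigl(F' > f_{\mathcal{K}}(D) \mid \mathcal{K}_2\bigr) \le \delta_2.$$

The assertion then follows from Theorem 4.11. ■

Proof of Corollary 5.2: As in the proof of Corollary 5.1, since $\mathbb{E}F' = \mathbb{E}F$, Lemma 4.1 implies that

$$\mathbb{P}^n\bigl(F' \le -f'_{\mathcal{H}}(\mathcal{D}_F, \mathcal{D}_{F'}, \delta_1) \mid \mathcal{H}_2\bigr) \le \delta_1,$$

$$\mathbb{P}^n\bigl(F' > f'_{\mathcal{K}}(\mathcal{D}_F, \mathcal{D}_{F'}, \delta_2) \mid \mathcal{K}_2\bigr) \le \delta_2,$$

where

$$f'_{\mathcal{H}}(r_1, r_2, \delta) := \frac{r_1}{\sqrt{2}}\sqrt{\log p^{-1}} + \frac{r_2}{\sqrt{2}}\sqrt{\log \delta^{-1}} - a,$$

$$f'_{\mathcal{K}}(r_1, r_2, \delta) := \frac{r_1}{\sqrt{2}}\sqrt{\log (1-p)^{-1}} + \frac{r_2}{\sqrt{2}}\sqrt{\log \delta^{-1}} + a'.$$

By Definition 4.3 we have $\mathcal{D}_{F_1} \le c_1 D_1$ and $\mathcal{D}_{F_2} \le c_2 D_2$. Therefore it follows that

$$\mathcal{D}_F = \mathcal{D}_{F_1 + F_2} \le \mathcal{D}_{F_1} + \mathcal{D}_{F_2} \le c_1 D_1 + c_2 D_2.$$

Moreover, the proof of Corollary 2.4 implies that

$$\mathcal{D}_{F'} \le \sqrt{\frac{\mathcal{D}_{F_1}^2}{n_1} + \frac{\mathcal{D}_{F_2}^2}{n_2}} \le \sqrt{\frac{c_1^2 D_1^2}{n_1} + \frac{c_2^2 D_2^2}{n_2}}.$$

Consequently, we have $f'_{\mathcal{H}}(\mathcal{D}_F, \mathcal{D}_{F'}, \delta_1) \le f_{\mathcal{H}}(D)$ and $f'_{\mathcal{K}}(\mathcal{D}_F, \mathcal{D}_{F'}, \delta_2) \le f_{\mathcal{K}}(D)$ and therefore

$$\mathbb{P}^n\bigl(F' \le -f_{\mathcal{H}}(D) \mid \mathcal{H}_2\bigr) \le \delta_1,$$

$$\mathbb{P}^n\bigl(F' > f_{\mathcal{K}}(D) \mid \mathcal{K}_2\bigr) \le \delta_2.$$

The assertion then follows from Theorem 4.11. ■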



7 Appendix 

The following Lemma from [31, Lem. 1.1.4 & Sec. 2.3] lists important properties of the distribution function $\mathbb{F}(x) := \mathbb{P}(X \le x)$ and its corresponding quantile function $\mathbb{F}^{-1}(t) := \inf\{x : \mathbb{F}(x) \ge t\}$.

Lemma 7.1 Let $\mathbb{F}$ be a distribution function. Then $\mathbb{F}$ is right continuous and the function $\mathbb{F}^{-1}(t), 0 < t < 1$ is non-decreasing, left continuous and satisfies

i) $\mathbb{F}^{-1}(\mathbb{F}(x)) \le x$, $-\infty < x < \infty$.

ii) $\mathbb{F}(\mathbb{F}^{-1}(t)) \ge t \ge \mathbb{F}(\mathbb{F}^{-1}(t)-)$, $0 < t < 1$.

iii) $\mathbb{F}(x) \ge t$ if and only if $x \ge \mathbb{F}^{-1}(t)$.
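These properties are easy to check on an empirical distribution, where both functions are step functions. The sketch below is an illustration rather than part of the paper; `make_ecdf` is a hypothetical helper. It builds $\mathbb{F}_n$ and its generalized inverse from a sample that contains a tie.

```python
def make_ecdf(samples):
    """Return (F, F_inv) for the empirical distribution of `samples`."""
    xs = sorted(samples)
    n = len(xs)

    def F(x):
        # Fraction of sample points less than or equal to x.
        return sum(1 for s in xs if s <= x) / n

    def F_inv(t):
        # inf{x : F(x) >= t}: the first order statistic whose rank fraction reaches t.
        for i, s in enumerate(xs):
            if (i + 1) / n >= t:
                return s
        raise ValueError("t must lie in (0, 1]")

    return F, F_inv


F, F_inv = make_ecdf([0.3, 1.2, 1.2, 2.5, 4.0])
x_grid = [0.0, 0.3, 1.0, 1.2, 2.5, 3.0, 4.0, 5.0]
t_grid = [0.1, 0.2, 0.4, 0.5, 0.6, 0.8, 0.9, 1.0]

# Property iii): F(x) >= t exactly when x >= F_inv(t).
galois = all((F(x) >= t) == (x >= F_inv(t)) for x in x_grid for t in t_grid)
# Property i): F_inv(F(x)) <= x wherever F(x) > 0.
contracts = all(F_inv(F(x)) <= x for x in x_grid if F(x) > 0.0)
# First half of property ii): F(F_inv(t)) >= t.
covers = all(F(F_inv(t)) >= t for t in t_grid)
```

Property iii) is the one used repeatedly in the proofs above to turn statements about quantiles into statements about counts of samples.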

Example 7.2 (Extreme values of $c_F$) Let $F(x) := \sum_{j=1}^m F_j(x_j)$. Then since $F(x) - F(x') = \sum_{j=1}^m \bigl(F_j(x_j) - F_j(x_j')\bigr)$ it follows that

$$D_F = \sum_{j=1}^m D_{F_j}.$$

Moreover, since $D^F_j = D_{F_j}, j = 1,..,m$ we obtain

$$\mathcal{D}_F^2 = \sum_{j=1}^m D_{F_j}^2$$

and therefore

$$c_F = \frac{\sqrt{\sum_{j=1}^m D_{F_j}^2}}{\sum_{j=1}^m D_{F_j}}.$$

In particular, when $D_{F_j} = D_{F_1}, j = 1,..,m$ we obtain $c_F = \frac{1}{\sqrt{m}}$. On the other hand, let $\mathcal{X} := [0, 1]^m \subset \mathbb{R}^m$ and let $F(x) := \|x\|$ for $\|x\| \le 1$ and $F(x) := 0$ for $\|x\| > 1$, where $\|x\|$ is the Euclidean norm of $x$. Then it is easy to see that $D_F = 1$, $D^F_j = 1, j = 1,..,m$ and therefore $\mathcal{D}_F^2 = m$. Consequently in this case we obtain $c_F = \sqrt{m}$.